A RULE OF THUMB FOR RIFFLE SHUFFLING 



SAMI ASSAF, PERSI DIACONIS, AND K. SOUNDARARAJAN 

Abstract. We study how many riffle shuffles are required to mix n cards if only certain
features of the deck are of interest, e.g. suits disregarded or only the colors of interest.
For these features, the number of shuffles drops from $\frac{3}{2}\log_2 n$ to $\log_2 n$. We derive closed
formulae and an asymptotic 'rule of thumb' formula which is remarkably accurate.



1. Introduction 

In this paper we study the mixing properties of the Gilbert-Shannon-Reeds model for
riffle shuffling n cards. Informally, the deck is cut into two piles by the binomial distribution,
and the cards are riffled together according to the rule: if the left packet has A cards and
the right has B cards, drop the next card from the left packet with probability A/(A + B)
(and from the right packet with probability B/(A + B)). Continue until all cards have been
dropped. This defines a measure, denoted $Q_2(\sigma)$, on the symmetric group $S_n$. Repeated
shuffles are defined by convolution powers

\[ Q_2^{*k}(\sigma) = \sum_{\tau \in S_n} Q_2(\tau)\, Q_2^{*(k-1)}(\sigma\tau^{-1}). \tag{1} \]

The uniform distribution is $U(\sigma) = 1/n!$. There are several notions of the distance between
$Q_2^{*k}$ and $U$: the total variation distance

\[ \|Q_2^{*k} - U\|_{TV} = \max_{A \subseteq S_n} |Q_2^{*k}(A) - U(A)| = \frac{1}{2}\sum_{\sigma \in S_n} |Q_2^{*k}(\sigma) - U(\sigma)| \tag{2} \]

and the separation and $l_\infty$ metrics

\[ SEP(k) = \max_{\sigma}\left(1 - \frac{Q_2^{*k}(\sigma)}{U(\sigma)}\right), \qquad l_\infty(k) = \max_{\sigma}\left|1 - \frac{Q_2^{*k}(\sigma)}{U(\sigma)}\right|. \tag{3} \]

In widely cited works, Aldous [2] and Bayer and Diaconis [5] show that $\frac{3}{2}\log_2(n) + c$
shuffles are necessary and sufficient to make the total variation distance small, while
$2\log_2(n) + c$ shuffles are necessary and sufficient to make separation and $l_\infty$ small.

The distances in (2) and (3) look at all aspects of a permutation. In many card games,
only some aspects of the permutation matter. For example, in Black-Jack and Baccarat,
suits are irrelevant and all 10's and picture cards are equivalent; ESP card guessing experiments
use a Zener deck of 25 cards with each of 5 symbols repeated five times. It is natural
to ask how many shuffles are required in these situations. These questions are studied by
Conger and Viswanath [10], [13], [12], [11], who derive a framework, new formulae and remarkable
numerical procedures giving useful answers for cases of practical interest. Their work
is reviewed at the end of this introduction.

In this paper, we develop formulae and asymptotics for a deck of n cards with $D_1$ cards
labelled 1, $D_2$ cards labelled 2, ..., $D_m$ cards labelled m. Most of the results are proved
for the deck starting 'in order', i.e. with 1's on top through m's at the bottom. In Section
5 we show that initial order can change the conclusions.

In Section 2 we begin with $D_1 = 1$ and $D_2 = n - 1$. The transition matrix for this case
has interesting properties, rivaling the 'Amazing Matrix' in [26]. Extending work of J.C.
Reyes [28], we show that $\log_2 n + c$ shuffles are necessary and sufficient for convergence in
any of our metrics.

Section 3 studies $D_1 = R$, $D_2 = B$, with, for example, R = B = 26 modeling the red-black
pattern for a standard 52 card deck. We derive a simple formula, first proved in [13],
for $Q_2^{*k}(w)$ for any pattern w and use this to again show that $\log_2 n + c$ steps are necessary
and sufficient for convergence to uniformity. We find this surprising: following a single
card involves a state space of size n, while reds and blacks involve a state space of size $\binom{n}{n/2}$,
yet the same number of shuffles is needed.

In Section 4 we treat the general case, deriving a formula which can be used for some
limited calculations. We also reprove a result of Conger-Viswanath determining where the
maxima for SEP and $l_\infty$ are achieved. A main result is a unified formula, our rule of
thumb:

Theorem 1.1. Consider a deck of n cards with $D_i$ cards of type i, $1 \le i \le m$, with
$D_i \ge d \ge 3$ and $n = D_1 + \cdots + D_m$. Then the separation distance after k shuffles is

\[ SEP(k) = 1 - (1+\eta)\,\frac{2^{k(m-1)}}{(n+1)\cdots(n+m-1)} \sum_{j=0}^{m-1} (-1)^j \binom{m-1}{j}\left(1 - \frac{j}{2^k}\right)^{n+m-1}, \]

where $\eta$ is a real number satisfying



. /( . 1(1 

j=0 



m—l 



This result does not depend on the individual details of the $D_i$ and shows that the same
number of shuffles is necessary and sufficient for a variety of questions. For numerical
approximation, we set $\eta = 0$ and simply compute the single sum. The bound on $\eta$ gives
explicit error estimates. We demonstrate that the rule of thumb is accurate for both single
card and red-black problems studied in earlier sections. Some numerical results are
summarized below.

Remarks on Table 1. The first row gives exact results from the Bayer-Diaconis formula for
the full permutation group. The other numbers are from the rule of thumb. The single card
or red-black numbers show that 6 shuffles achieve the same separation as 12 shuffles for
the full deck. The Black-Jack (equivalently Baccarat) numbers suggest a savings of two or
three shuffles, and the suit numbers lie in between. The final row is the rule of thumb for
the Zener deck with 25 cards, 5 cards for each of 5 suits.

Table 1. Rule of Thumb for the separation distance for k shuffles of 52 cards.

k            1     2     3     4     5     6     7     8     9     10    11    12
BD-92        1.00  1.00  1.00  1.00  1.00  1.00  1.00  .995  .928  .729  .478  .278
blackjack    1.00  1.00  1.00  1.00  .999  .970  .834  .596  .366  .204  .108  .056
suits        1.00  1.00  .997  .976  .884  .683  .447  .260  .140  .073  .037  .019
redblack     .962  .925  .849  .708  .508  .317  .179  .095  .049  .025  .013  .006
Zener (25)   1.00  1.00  .993  .943  .778  .536  .321  .177  .093  .048  .024  .012

As explained at the end of Section 4, the proof of Theorem 1.1 results from an approximation
to an m-fold iterated sum. Direct evaluation of this sum was sometimes possible,
though it became intractable for many cases of interest. A referee points out that we show
that the sum is a coefficient in a product of polynomials. Fast multiplication of polynomials
results in a useful polynomial time algorithm. One may ask, if the numbers can be
computed exactly, why bother with limit theorems and approximations? One answer comes
from understanding: the red/black and single card configurations have very similar behaviors.
This is surprising and calls out for explanation. The rule of thumb formula explains
why many different configurations require the same number of shuffles.
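
The rule of thumb with $\eta = 0$ can be evaluated directly. The following Python sketch does this with exact rational arithmetic. The function name `rule_of_thumb_sep` and its argument order are illustrative choices, not notation from the paper. The sketch assumes $2^k \ge m - 1$, so that every base $1 - j/2^k$ in the sum is nonnegative, and raises an error otherwise.

```python
from fractions import Fraction
from math import comb

def rule_of_thumb_sep(n, m, k):
    """Separation after k riffle shuffles from the rule of thumb with eta = 0.

    n: number of cards, m: number of card types, k: number of shuffles.
    Assumption of this sketch: 2**k >= m - 1 (all bases 1 - j/2**k are >= 0).
    """
    a = 2 ** k
    if a < m - 1:
        raise ValueError("this sketch only handles 2**k >= m - 1")
    denom = 1
    for t in range(1, m):              # (n+1)(n+2)...(n+m-1)
        denom *= n + t
    alternating = sum(
        (-1) ** j * comb(m - 1, j) * Fraction(a - j, a) ** (n + m - 1)
        for j in range(m)
    )
    return float(1 - Fraction(a ** (m - 1), denom) * alternating)

# Red-black deck (m = 2) and suits only (m = 4) for 52 cards.
for shuffles in (1, 2, 3, 10):
    print(shuffles, round(rule_of_thumb_sep(52, 2, shuffles), 3))
print(round(rule_of_thumb_sep(52, 4, 3), 3))
```

With these inputs the sketch can be compared entry by entry against the redblack, suits and Zener rows of Table 1.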

In an appendix, we show that the processes studied below are quotient walks with respect
to Young subgroups of $S_n$. We show how representation theory can be used to derive results
for features of the random transposition random walk.

Literature review of riffle shuffles. The basic shuffling model was introduced by Gilbert
and Claude Shannon in an unpublished report [25]. The model was independently introduced
and studied by Jim Reeds in unpublished work [27]. The first rigorous results are by
Aldous [1], who showed that asymptotically $\frac{3}{2}\log_2(n)$ shuffles are correct for total variation.
Separation distance is introduced in connection with stopping time arguments in Aldous
and Diaconis [2]. They show that $2\log_2 n + c$ steps are necessary and sufficient for separation
convergence. The cutoff phenomenon is first noticed in this paper as well. Recent work on
the cutoff phenomenon is in [HJ [2^ I16j . Our work below adds several new examples to the 
list of problems where the cutoff can be explicitly determined. 

A generalization to a-shuffles is introduced by Bayer-Diaconis in [5]. Here the deck is cut
into a packets by a multinomial distribution, and then cards are dropped from packets with
probability proportional to packet size. Letting $Q_a(\sigma)$ denote this measure, they show

\[ Q_a * Q_b = Q_{ab}. \tag{4} \]

Thus it is enough to study a single a-shuffle. The main result of their paper is the simple
formula

\[ Q_a(\sigma) = \frac{\binom{n+a-r}{n}}{a^n}, \tag{5} \]

where $r = r(\sigma)$ is the number of rising sequences in $\sigma$ ($r(\sigma) = d(\sigma^{-1}) + 1$ with $d$ the number
of descents). This allows simple closed form expressions for a variety of distances.
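
As an illustration of how formula (5) yields closed form distances, the following Python sketch groups permutations by their number of rising sequences using Eulerian numbers. The helper names `eulerian_row` and `full_deck_distances` are hypothetical, and the use of the standard Eulerian recurrence is a choice of this sketch rather than something taken from the text.

```python
from fractions import Fraction
from math import comb, factorial

def eulerian_row(n):
    """Return [A(n,0), ..., A(n,n-1)], where A(n,d) counts permutations with d descents."""
    row = [1]
    for size in range(2, n + 1):
        new_row = []
        for d in range(size):
            same = row[d] if d < size - 1 else 0
            fewer = row[d - 1] if d >= 1 else 0
            new_row.append((d + 1) * same + (size - d) * fewer)
        row = new_row
    return row

def full_deck_distances(n, k):
    """Exact (total variation, separation) after k riffle shuffles of n distinct cards."""
    a = 2 ** k
    uniform = Fraction(1, factorial(n))
    tv = Fraction(0)
    for d, count in enumerate(eulerian_row(n)):
        r = d + 1                                    # number of rising sequences
        prob = Fraction(comb(n + a - r, n), a ** n)  # formula (5)
        tv += count * abs(prob - uniform)
    # The least likely permutations have r = n, so separation is attained there.
    sep = 1 - Fraction(comb(a, n), a ** n) / uniform
    return float(tv / 2), float(sep)

print(full_deck_distances(52, 7))
```

The count of permutations with r rising sequences equals the count with r - 1 descents, which is why the Eulerian row can be used here.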

A number of extensions and variations have since developed. We will not survey these
here (see [15] for a thorough treatment) but mention that features of permutations are shown
to achieve the correct limiting distribution in fewer shuffles. For example, $\frac{5}{6}\log_2 n + c$ suffice
for the longest increasing subsequences [21], $\log_2(n)$ for the descent structure [16], $k_n \to \infty$
arbitrarily slowly for the cycle structure [18] and a single shuffle suffices for the longest
cycle [18]. A recent addition is the work of Chen and Saloff-Coste [8] studying random
combinations of a-shuffles for randomly varying a.

Mark Conger and D. Viswanath study the same type of problems as we do, where cards are
identified under the action of a subgroup. In [10], they lay out the basic problems, develop
a formalism for calculations involving descent polynomials (a generalization of Eulerian
polynomials), and use these to derive a closed formula for the chance of a given arrangement
after an a-shuffle for decks labelled $\{1, 2, \ldots, h, x^n\}$. This includes both our single card case
and the full deck case. They show that the probability of an arrangement is



with r the number of cards labelled c, 1 < c < h, that are not preceded by a card labelled 
c — 1 and / the number of cards labeled x that precede the card labeled h. This elegant 
expression can be analyzed asymptotically using the analytic techniques of Sections 2-5 
below. Their main results pertain to red-black decks where they derive equivalence relations
on configurations that have the same probability. They point out that starting with the reds
on top or reds alternating with blacks can lead to different conclusions. In a preliminary
version of their work, they give an earlier proof of the exact formula for red-black decks found in
Theorem 3.1 below and good asymptotic approximations to the total variation distance for
following a single card. While this takes $\log_2(n) + c$ shuffles, they also prove the surprising result
that the seemingly similar problem of randomizing the current top card takes half this many
shuffles.

In [12], the authors use their earlier work on descent polynomials to develop a fascinating
Monte Carlo procedure for approximating the total variation distance. Our exact and
asymptotic calculations overlap theirs in many places, and in every case we find their numbers
spot on. This leads us to accept their estimates for problems of deck hands at bridge
where we have not found a way to do exact calculations. The algorithms in [10, 12] can be
used to give polynomial time procedures for exact calculation of the numbers in Table 1.
The authors have also proved some complexity results showing that exact computation is
intractable for some of these problems. They have used their algorithms to calculate an
exact version of Table 1 above. Our numbers are based on our rule of thumb. Their exact
numbers agree to the accuracy given except that for k = 1, 2 in the red-black category they
get .8898... and .8897... and our approximation gives .962 and .925, respectively. We are
impressed by (and thankful for) both their accurate algorithms and the accuracy of the rule
of thumb. It is also worth reporting that their stochastic approximation works in reasonable
time to give useful approximations for the analog of Table 1 for total variation distance.

The results derived here add to the results of Conger-Viswanath in the following ways.
First, we present some new formulae (e.g. the transition matrix for single card mixing or the 
red-black formula) which allow exact computations. Second, we derive asymptotic approx- 
imations for a variety of cases. Third, we supplement these formulae and approximations 
with our unifying 'rule of thumb'. 

We mention the broad extensions of riffle shuffling to random walks on hyperplane arrangements
due to Bidigare, Hanlon and Rockmore (see [15] for a survey). The process
induced by observing which chamber of a sub-arrangement contains the present state of the 
original walk is still Markov. Rates of convergence for these sub-arrangement walks are in 

A Note on Metrics. This paper focuses on the separation and $l_\infty$ metrics with some
attention on the more widely used total variation metric. These metrics are related and 
often give 'about the same answer' asymptotically. For example, in [3, Proposition 5.13] it is
shown that the separation distance $s(l)$ and total variation distance $d(l)$ to uniformity after $l$
steps for a transitive Markov chain satisfy

\[ d(2l) \le s(2l) \le \varphi(d(l)) \]

for $l \ge 1$ provided $d(l) \le 1/8$, with $\varphi(x) = 1 - (1 - 2x^{1/2})(1 - x^{1/2})^2 \sim 4x^{1/2}$ as $x \to 0$.
This shows (roughly) that $d(l)$ is small if and only if $s(2l)$ is small.

In discussing total variation, separation and $l_\infty$ distances we have sometimes heard the
comment that these are worst case measures that take their maximizing values at a
specific set or point. If this set or point is not a particularly relevant configuration, the 
distance may lose its relevance as well. One can construct artificial examples where this 
argument has merit, but worst case bounds are conservative and the underlying inequalities 
show that they hold for all configurations. We suspect that in the present natural context, 
there are many close by configurations that are also 'off'. This suggests interesting research 
questions. 

Of course, the non-asymptotic results depend on the details of the metric. We do not think
that there is 'one right metric'. Often $l_\infty$ and separation are easier to bound. Sometimes,
the tails of the distribution and rare points neglected by total variation and weak star 
metrics are what is of interest. Unbounded functions such as the number of correct guesses 
when cards are turned up sequentially require separate treatment. The closed formulae 
reported here can be used for any of these tasks. 

Several other metrics are in widespread use. These include the Chi-square or $l_2$ distance
(useful for reversible chains and eigenvalue arguments), entropy distance, and various weak-star
metrics such as the maximum distance between the probability of balls in some metric
or the Wasserstein distance. For discussion and comparison see [24], [16].

2. Following a single card 

Suppose one notices that the ace of spades is on the bottom of a deck of n cards. 
How many shuffles does it take until this one card is close to uniformly distributed on 
{1, 2, ..., n}? This problem was studied by J.C. Reyes [28, Chapters 3 and 5]. He derived
the eigenvalues given below and also gave a coupling argument that shows that $\log_2(n) + c$
shuffles suffice for total variation convergence. A different proof of this result is by Fulman
[22, Cor. 3.9], who derives it as a consequence of his work on combining shuffles and random
cuts. An asymptotic expansion for the total variation appears in Conger and Viswanath
[12]. This shows that the upper bound cannot be improved. In this section we elaborate
on these results by studying the transition matrix, giving its eigenvalues and eigenvectors and
giving matching upper and lower bounds for the $l_\infty$, separation and total variation distance.
As shown in an appendix, under repeated shuffles a single card moves according to a Markov 
chain. We begin by writing down the transition matrix. 

Proposition 2.1. Let $P_a(i,j)$ be the chance that the card at position i moves to position j
after an a-shuffle. For $1 \le i, j \le n$, $P_a(i,j)$ is given by

\[ P_a(i,j) = \frac{1}{a^n} \sum_{k=1}^{a} \sum_{r=l}^{u} \binom{j-1}{r}\binom{n-j}{i-1-r}\, k^{r} (a-k)^{j-1-r} (k-1)^{i-1-r} (a-k+1)^{(n-j)-(i-1-r)}, \]

where r ranges from $l = \max(0, (i+j) - (n+1))$ to $u = \min(i-1, j-1)$.

Proof. We calculate $Q_a(j,i)$, the chance that an inverse a-shuffle brings the card at position
j to position i. For this to occur, the card at position j may be labelled by k, $1 \le k \le a$.
Then r cards above this card may be labelled from 1 to k. All will appear before the card
at position j, and they can be chosen in $\binom{j-1}{r}$ ways. The remaining cards above must be labelled from k + 1 to a. Here
$0 \le r \le \min(j-1, i-1)$. Also if m cards below position j are labelled from 1 to k - 1,
then $m + r = i - 1$, $m \le n - j$ and so $r \ge (i+j) - (n+1)$. Finally, the $i - 1 - r$ cards below
position j that are labelled from 1 to k - 1 can be chosen in $\binom{n-j}{i-1-r}$ ways, and the remaining cards below must
be labelled from k to a. □
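
The counting argument in the proof translates directly into code. The following Python sketch sums over the label k of the tracked card and the number r of cards above it that stay above it; `single_card_transition` and `single_card_matrix` are hypothetical names, and positions are 1-indexed from the top as in the text.

```python
from fractions import Fraction
from math import comb

def single_card_transition(n, a, i, j):
    """Chance that the card at position i moves to position j after an a-shuffle."""
    lo = max(0, (i + j) - (n + 1))
    hi = min(i - 1, j - 1)
    total = 0
    for k in range(1, a + 1):            # label of the tracked card
        for r in range(lo, hi + 1):      # cards above it that remain above it
            below = i - 1 - r            # cards below it that jump ahead of it
            total += (comb(j - 1, r) * comb(n - j, below)
                      * k ** r * (a - k) ** (j - 1 - r)
                      * (k - 1) ** below * (a - k + 1) ** ((n - j) - below))
    return Fraction(total, a ** n)

def single_card_matrix(n, a):
    """The n x n transition matrix as a list of rows of exact fractions."""
    return [[single_card_transition(n, a, i, j) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

for row in single_card_matrix(3, 2):
    print([str(entry) for entry in row])
```

Python evaluates `0 ** 0` as 1 for integers, which matches the convention needed when a card count in an exponent is zero.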

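For example, the n x n transition matrices for n = 2, 3 are given below.

\[ \frac{1}{2a}\begin{pmatrix} a+1 & a-1 \\ a-1 & a+1 \end{pmatrix}, \qquad \frac{1}{6a^2}\begin{pmatrix} (a+1)(2a+1) & 2(a^2-1) & (a-1)(2a-1) \\ 2(a^2-1) & 2(a^2+2) & 2(a^2-1) \\ (a-1)(2a-1) & 2(a^2-1) & (a+1)(2a+1) \end{pmatrix} \]

Two other special cases to note are the extreme cases when i = 1 or i = n, which are given
by

\[ P_a(1,j) = \frac{1}{a^n}\sum_{k=1}^{a} (k-1)^{j-1} k^{n-j}, \qquad P_a(n,j) = \frac{1}{a^n}\sum_{k=1}^{a} (k-1)^{n-j} k^{j-1}. \]

These single card transition matrices are studied by Ciucu [9], who gives a closed form
for all n when a = 2:

\[ P_2(i,j) = \begin{cases} \dfrac{1}{2^{n}}\left(2^{i-1} + 2^{n-i}\right) & \text{if } i = j, \\[1ex] \dfrac{1}{2^{n-j+1}}\dbinom{n-j}{i-j} & \text{if } i > j, \\[1ex] P_2(n-i+1, n-j+1) & \text{if } i < j. \end{cases} \]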

These matrices share many properties of the 'amazing matrix' developed by Holte [26].
See also for connections between Holte's amazing matrices and card shuffling. The 
following Proposition is essentially due to Ciucu [9].

Proposition 2.2. The transition matrices following a single card have the following properties:

(1) they are cross-symmetric, i.e. $P_a(i,j) = P_a(n-i+1, n-j+1)$;

(2) $P_a \cdot P_b = P_{ab}$;

(3) the eigenvalues are $1, 1/a, 1/a^2, \ldots, 1/a^{n-1}$;

(4) the right eigenvectors are independent of a and have the simple form:
$V_m(i) = (-1)^{i-1}\binom{m-1}{i-1} + (-1)^{n-i+m}\binom{m-1}{n-i}$ for $m \ge 1$.

Proof. The cross-symmetry (1) follows from Proposition 2.1 and the multiplicative property
(2) follows from the shuffling interpretation and equation (4). Property (1) implies that the
eigenstructure is quite constrained; see [30]. Properties (3) and (4) follow from results of
Ciucu [9]. □

Remark 2.3. We note that Holte's matrix arose from studying the 'carries process' of ordinary
addition. Diaconis and Fulman [16] show that it is also the transition matrix for the
number of descents in repeated a-shuffles. We have not been able to find a closer connection
between the two matrices.

From Proposition 2.1 we obtain the following Corollary, which also follows as a special
case of Theorem 2.2 in [10].

Corollary 2.4. Consider a deck of n cards with the ace of spades starting at the bottom.
Then the chance that the ace of spades is at position j from the top after an a-shuffle is

\[ Q_a(j) = P_a(n,j) = \frac{1}{a^n}\sum_{k=1}^{a} (k-1)^{n-j} k^{j-1}. \tag{7} \]

From the explicit formula, we are able to give exact numerical calculations and sharp
asymptotics for any of the distances to uniformity. The results below show that $\log_2 n + c$
shuffles are necessary and sufficient for both separation and total variation (and there is a
cutoff for these). This is surprising since, on the full permutation group, separation requires
$2\log_2 n + c$ steps whereas total variation requires $\frac{3}{2}\log_2 n + c$. Of course, for any specific n,
these asymptotic results are just indicative.
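
To illustrate the kind of exact calculation this allows, here is a short Python sketch that evaluates the law of a card starting at the bottom, $Q_a(j) = a^{-n}\sum_{k=1}^{a}(k-1)^{n-j}k^{j-1}$, and the three distances to uniformity. The function names are hypothetical and the number of shuffles enters through $a = 2^k$.

```python
from fractions import Fraction

def bottom_card_law(n, a):
    """Exact distribution of the original bottom card's position after an a-shuffle."""
    return [Fraction(sum((k - 1) ** (n - j) * k ** (j - 1) for k in range(1, a + 1)),
                     a ** n)
            for j in range(1, n + 1)]

def bottom_card_distances(n, shuffles):
    """Return (total variation, separation, l-infinity) after the given number of shuffles."""
    law = bottom_card_law(n, 2 ** shuffles)
    uniform = Fraction(1, n)
    tv = sum(abs(q - uniform) for q in law) / 2
    sep = max(1 - q / uniform for q in law)
    linf = max(abs(1 - q / uniform) for q in law)
    return float(tv), float(sep), float(linf)

for shuffles in (1, 4, 7, 10):
    print(shuffles, [round(x, 3) for x in bottom_card_distances(52, shuffles)])
```

Exact fractions are used because the probabilities near the top of the deck are extremely small after few shuffles.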



Table 2. Distance to uniformity for a deck of 52 distinct cards.

k            1      2      3      4      5      6     7     8     9     10    11    12
TV           1.00   1.00   1.00   1.00   .924   .614  .334  .167  .085  .043  .021  .010
SEP          1.00   1.00   1.00   1.00   1.00   1.00  1.00  .996  .931  .732  .479  .278
l_∞          --     10^41  10^29  10^19  10^12  10^7  10^4  128   11.3  2.57  .900  .380



Table 3. Distance to uniformity for a single card starting at the bottom of
a 52 card deck.

k            1     2     3     4     5     6     7     8     9     10    11    12
TV           .873  .752  .577  .367  .200  .103  .052  .026  .013  .007  .003  .002
SEP          1.00  1.00  .993  .875  .605  .353  .190  .098  .050  .025  .013  .006
l_∞          25.0  12.0  5.51  2.37  1.02  .460  .217  .105  .052  .026  .013  .006



Remarks on Table 3. We use Proposition 2.1 to give exact results when n = 52. For
comparison, Table 2 gives exact results for the full deck using [5]. Tables 3 and 4 show
that it takes about half as many or fewer shuffles to achieve a given degree of mixing for a
card at the bottom of the deck. For example, the widely cited '7 shuffles' for total variation
drops this distance to .334 for the full ordering, but it takes only 4 shuffles to achieve
a similar degree of randomness for a single card at the bottom, and only 2 for a single card
starting in the middle. Similar statements hold for the separation and $l_\infty$ metrics.

Table 4. Distance to uniformity for a single card starting at the middle of
a 52 card deck.

k            1     2     3     4
TV           .494  .152  .001  .000
SEP          1.00  .487  .003  .000
l_∞          1.92  .487  .003  .000



For asymptotic results, we first derive an approximation to separation. Since separation 
is an upper bound for total variation, this gives an upper bound for total variation. Finally, 
we derive a matching lower bound for total variation. 

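Proposition 2.5. After an a-shuffle, the probability that the bottom card is at position i
satisfies

\[ \frac{1}{a}\,\frac{\alpha^{n-i+1}}{1-\alpha^{n}} \le Q_a(i) \le \frac{1}{a}\,\frac{\alpha^{n-i}}{1-\alpha^{n-1}}, \]

where for brevity we have set $\alpha = 1 - 1/a$. In particular, the separation distance satisfies

\[ 1 - \frac{n}{a}\,\frac{\alpha^{n}}{1-\alpha^{n}} \ge SEP(a) \ge 1 - \frac{n}{a}\,\frac{\alpha^{n-1}}{1-\alpha^{n-1}}. \]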

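Proof. Since $k/(k-1) \ge a/(a-1)$ for all $1 < k \le a$ we find that

\[ \alpha^{n-i} Q_a(n) \ge Q_a(i) \ge \alpha^{-(i-1)} Q_a(1). \tag{8} \]

Therefore

\[ 1 = \sum_{i=1}^{n} Q_a(i) \ge Q_a(1) \sum_{i=1}^{n} \alpha^{-(i-1)} = Q_a(1)\, a\, \alpha^{1-n}(1-\alpha^{n}), \]

so that

\[ Q_a(1) \le \frac{1}{a}\,\frac{\alpha^{n-1}}{1-\alpha^{n}} \le \frac{1}{a}\,\frac{\alpha^{n-1}}{1-\alpha^{n-1}}. \]

Since $Q_a(n) = Q_a(1) + 1/a$ it follows that $Q_a(n) \le \frac{1}{a}\,\frac{1}{1-\alpha^{n-1}}$. Using (8) the desired upper
bound for $Q_a(i)$ follows.

Similarly,

\[ 1 = \sum_{i=1}^{n} Q_a(i) \le Q_a(n) \sum_{i=1}^{n} \alpha^{n-i} = Q_a(n)\,\frac{1-\alpha^{n}}{1-\alpha}, \]

so that

\[ Q_a(n) \ge \frac{1}{a}\,\frac{1}{1-\alpha^{n}}. \]

Since $Q_a(1) = Q_a(n) - 1/a$ it follows that $Q_a(1) \ge \frac{1}{a}\,\frac{\alpha^{n}}{1-\alpha^{n}}$, and from (8) the desired lower
bound for $Q_a(i)$ follows. From (17) and the above estimates we obtain our bounds on
SEP(a). □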
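
A quick numerical sanity check of these bounds is possible with exact arithmetic. The sketch below reads the bounds as $\frac{1}{a}\frac{\alpha^{n-i+1}}{1-\alpha^{n}} \le Q_a(i) \le \frac{1}{a}\frac{\alpha^{n-i}}{1-\alpha^{n-1}}$ with $\alpha = 1 - 1/a$ and $Q_a(i) = a^{-n}\sum_{k=1}^{a}(k-1)^{n-i}k^{i-1}$; the function name `bounds_hold` is hypothetical, and the check assumes $n \ge 2$ and $a \ge 2$.

```python
from fractions import Fraction

def bounds_hold(n, a):
    """Check the two-sided bound on Q_a(i) for every position i (needs n >= 2, a >= 2)."""
    alpha = Fraction(a - 1, a)
    for i in range(1, n + 1):
        q = Fraction(sum((k - 1) ** (n - i) * k ** (i - 1) for k in range(1, a + 1)),
                     a ** n)
        lower = alpha ** (n - i + 1) / (a * (1 - alpha ** n))
        upper = alpha ** (n - i) / (a * (1 - alpha ** (n - 1)))
        if not (lower <= q <= upper):
            return False
    return True

print(bounds_hold(52, 8), bounds_hold(10, 4))
```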

If $a = 2^{\log_2(n)+c} = n2^{c}$, then our result shows that SEP(a) is approximately

\[ 1 - \frac{2^{-c}}{e^{2^{-c}} - 1}, \]

and for large c this is $\sim 2^{-c-1}$. The fit to the data in Table 3 is excellent: for example after
ten shuffles of a fifty-two card deck we have $2^{-c-1} = \frac{26}{1024} = \frac{13}{512}$, which is very nearly the observed
separation distance of 0.025.

Remark 2.6. Proposition 2.5 gives a local limit for the probability that the original bottom
card is at position j from the bottom. When the number of shuffles is $\log_2 n + c$, the density of
this (with respect to the uniform measure) is asymptotically z{c)e~^^'^'' , with z a normalizing 
constant {z{c) = 1 /2'^ {e^ /"^^ — 1)). The result is uniform in j for c fixed, n large. 

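Proposition 2.7. Consider a deck of n cards with the ace of spades at the bottom. With
$\alpha = 1 - 1/a$, the total variation distance for the mixing of the ace of spades after an a-shuffle
is at most

\[ \frac{\alpha^{n+1}}{1-\alpha^{n}} - \frac{a\alpha^{2}(1-\alpha^{n-1})}{n(1-\alpha^{n})} + \frac{1}{n\log(1/\alpha)}\log\left(\frac{a}{n}\,\frac{1-\alpha^{n}}{\alpha^{n+1}}\right) \]

and at least

\[ \frac{\alpha^{n}}{1-\alpha^{n-1}} - \frac{a(1-\alpha^{n})}{n\alpha(1-\alpha^{n-1})} + \frac{1}{n\log(1/\alpha)}\log\left(\frac{a}{n}\,\frac{1-\alpha^{n-1}}{\alpha^{n-1}}\right). \]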

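Proof. Let $Q_a(i)$ denote the probability that the ace of spades is at position i from the top
after an a-shuffle. Note that $Q_a(i)$ is monotone increasing in i, and let $i^*$ be such that
$Q_a(i^*) \le 1/n < Q_a(i^*+1)$. From Proposition 2.5 we find that $i^*$ satisfies

\[ \frac{\alpha^{n-i^*+1}}{a(1-\alpha^{n})} \le \frac{1}{n} \le \frac{\alpha^{n-i^*-1}}{a(1-\alpha^{n-1})}, \tag{9} \]

so that

\[ \log\left(\frac{a}{n}\,\frac{1-\alpha^{n-1}}{\alpha^{n-1}}\right) \le i^*\log(1/\alpha) \le \log\left(\frac{a}{n}\,\frac{1-\alpha^{n}}{\alpha^{n+1}}\right). \tag{10} \]

From Proposition 2.5 we have that the desired total variation is

\[ \sum_{i \le i^*}\left(\frac{1}{n} - Q_a(i)\right) \le \frac{i^*}{n} - \sum_{i \le i^*}\frac{\alpha^{n-i+1}}{a(1-\alpha^{n})} = \frac{i^*}{n} - \frac{\alpha^{n-i^*+1}\left(1-\alpha^{i^*}\right)}{1-\alpha^{n}}, \]

and also

\[ \sum_{i \le i^*}\left(\frac{1}{n} - Q_a(i)\right) \ge \frac{i^*}{n} - \frac{\alpha^{n-i^*}\left(1-\alpha^{i^*}\right)}{1-\alpha^{n-1}}. \]

Using (9) and (10) we obtain the Proposition. □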

Remark 2.8. After $\log_2 n + c$ shuffles, that is when $a = 2^{c}n$, Proposition 2.7 shows that the
total variation distance is approximately (with $C = 2^{c}$)

\[ \frac{1}{e^{1/C} - 1} - C + C\log\left(C\left(e^{1/C} - 1\right)\right). \]

Thus when c is 'large and negative,' the total variation is close to 1, and when c is large
and positive, the total variation is close to 0. Thus total variation and separation converge
at the same rate. This is an asymptotic result and, for example, Table 3 supports this.

Remark 2.9. From Proposition 4.1 the $l_\infty$ distance is achieved for configurations with the
ace of spades back on the bottom. Proposition 2.5 gives a formula for this and the arguments
for Propositions 2.5 and 2.7 show that $\log_2 n + c$ shuffles are necessary and sufficient for
convergence in $l_\infty$.

Remark 2.10. Similar, but more demanding, calculations show that if the ace of spades
starts at position i, and $\min(i/n, (n-i)/n) \ge A > 0$ for some fixed positive A, then
$\frac{1}{2}\log_2 n$ shuffles suffice for convergence in any of the metrics. We omit further details.



3. A red-black deck

We focus now on riffle shuffles of a deck consisting of R red cards and B black cards.
The purpose of this section is to give an explicit description of a-shuffles of the deck with
initial configuration of reds atop blacks. In Bayer-Diaconis [5], the formula describing when
an a-shuffle of n distinct cards results in a particular permutation has the simple form

\[ \frac{1}{a^{n}}\binom{a+n-r}{n}, \]

where r is the number of rising sequences in the permutation. The analysis for the red-black
deck is markedly different. One indication of this comes by noticing how likely the reverse
deck is to occur. In the case of permutations, the reverse deck has n rising sequences, and
so the Bayer-Diaconis formula dictates that this configuration cannot occur unless $a \ge n$.
However, in the red-black case, the reverse deck (blacks atop reds) may occur after a single
2-shuffle no matter the deck size.

We begin by determining a formula for the chance of any arrangement following an
a-shuffle. This formula was proved earlier in unpublished work of Conger and Viswanath [11].
We use the result to derive the numbers in Table 1. It also serves as a simple case of the
more complex argument in Section 4 which also gives useful asymptotics.

Theorem 3.1. Consider a deck with R red cards on top of B black cards. The probability
that an a-shuffle will result in the deck configuration w is

\[ Q_a(w) = \frac{1}{a^{R+B}} \sum_{k=1}^{a}\sum_{j=1}^{R} (k-1)^{R-j}\, k^{j-1}\, (a-k)^{b(j)}\, (a-k+1)^{B-b(j)}, \tag{11} \]

where $b(j) = b_w(j)$ is the number of black cards above the jth red card in the deck w.
Proof. The general formula for the probability of w resulting from an a-shuffle is given by

\[ \sum_{A} \frac{1}{a^{R+B}}\binom{R+B}{A_1,\ldots,A_a}\,\mathrm{prob}(w \mid A), \tag{12} \]

where the sum is over all non-negative compositions $A = (A_1, A_2, \ldots, A_a)$ of $R + B$, i.e.
$A_i \ge 0$ and $A_1 + A_2 + \cdots + A_a = R + B$, and $\mathrm{prob}(w \mid A)$ denotes the probability that
w results from successively dropping cards from the piles $A_i$. We break the sum into the
following two cases: either there exists an integer k such that $A_1 + A_2 + \cdots + A_k = R$ or
not.

Consider the case when the sum of the first k piles is exactly R. Then, the result of the
subsequent riffle shuffle is equally likely to be any of the possible deck configurations.
That is to say, given such a cut A, $\mathrm{prob}(w \mid A) = 1/\binom{R+B}{R}$ for every w. Therefore the
contribution to $Q_a(w)$ from all such cuts is given by

\[
\begin{aligned}
\sum_{\substack{A_1+\cdots+A_a=R+B \\ \exists\, k \text{ s.t. } A_1+\cdots+A_k=R}} \frac{1}{a^{R+B}}\binom{R+B}{A_1,\ldots,A_a}\frac{1}{\binom{R+B}{R}}
&= \frac{1}{a^{R+B}}\sum_{k=1}^{a-1}\sum_{A_k=1}^{R}\sum_{\substack{A_{k+1}+\cdots+A_a=B \\ A_1+\cdots+A_{k-1}=R-A_k}} \binom{R}{A_k}\binom{R-A_k}{A_1,\ldots,A_{k-1}}\binom{B}{A_{k+1},\ldots,A_a} \\
&= \frac{1}{a^{R+B}}\sum_{k=1}^{a-1}(a-k)^{B}\left(k^{R}-(k-1)^{R}\right).
\end{aligned}
\]

The choice to let k be the first index such that $A_1 + \cdots + A_k = R$ is necessary in order
to avoid over counting compositions with many 0's. This choice seemingly breaks the
symmetry between R and B in the final formulation. However, the symmetric version may
be obtained by taking k to be the last index such that $A_1 + \cdots + A_k = R$. Finally, note
that since $B \ne 0$, we may in fact take the sum over k to range from 1 to a.

Now consider the alternative case when there exists a pile (necessarily unique) containing
both red and black cards. The assumption on A amounts to the existence of integers k, x, y,
with $1 \le k \le a$, $1 \le x \le R$, $1 \le y \le B$, such that $A_1 + \cdots + A_{k-1} = R - x$, $A_k = x + y$, and
$A_{k+1} + \cdots + A_a = B - y$. Given such a cut A, $\mathrm{prob}(w \mid A) = r_{x,y}(w)/\binom{R+B}{R-x,\,x+y,\,B-y}$, where
$r_{x,y}(w)$ denotes the number of rising subsequences consisting of x red cards followed by y
black cards. The resulting contribution to $Q_a(w)$ from all such cuts is given by

\[
\begin{aligned}
\sum_{\substack{A_1+\cdots+A_a=R+B \\ \exists\, k \text{ s.t. } A_1+\cdots+A_{k-1}<R \\ \text{and } A_{k+1}+\cdots+A_a<B}} \frac{1}{a^{R+B}}\binom{R+B}{A_1,\ldots,A_a}\frac{r_{x,y}(w)}{\binom{R+B}{R-x,\,x+y,\,B-y}}
&= \frac{1}{a^{R+B}}\sum_{k=1}^{a}\sum_{x=1}^{R}\sum_{y=1}^{B}\sum_{\substack{A_1+\cdots+A_{k-1}=R-x \\ A_{k+1}+\cdots+A_a=B-y}} \binom{R-x}{A_1,\ldots,A_{k-1}}\binom{B-y}{A_{k+1},\ldots,A_a}\, r_{x,y}(w) \\
&= \frac{1}{a^{R+B}}\sum_{k=1}^{a}\sum_{x=1}^{R}\sum_{y=1}^{B} r_{x,y}(w)\,(k-1)^{R-x}(a-k)^{B-y}.
\end{aligned}
\]

For the final equation to make sense, we adopt the convention that $0^0 = 1$.

Let $b(j)$ denote the number of black cards above the jth red card in w. We may count
rising subsequences of w by the last red card used in the subsequence, giving the equation

\[ r_{x,y}(w) = \sum_{j=1}^{R}\binom{j-1}{x-1}\binom{B-b(j)}{y}. \tag{13} \]

To see this, note that the first binomial coefficient counts the number of choices of the other x - 1 red cards
before the jth red card, and the second binomial coefficient counts the number of choices
for y black cards after the jth red card. Inserting this into the x and y summations above
gives

\[
\frac{1}{a^{R+B}}\sum_{k=1}^{a}\sum_{x=1}^{R}\sum_{y=1}^{B} r_{x,y}(w)\,(k-1)^{R-x}(a-k)^{B-y}
= \frac{1}{a^{R+B}}\sum_{k=1}^{a}\sum_{j=1}^{R}(k-1)^{R-j}k^{j-1}(a-k)^{b(j)}\left((a-k+1)^{B-b(j)} - (a-k)^{B-b(j)}\right).
\]

The probability $Q_a(w)$ is obtained by adding the expressions in these two cases. Since

\[
\sum_{k=1}^{a}\sum_{j=1}^{R}(k-1)^{R-j}k^{j-1}(a-k)^{B}
= \sum_{k=1}^{a} k^{R-1}(a-k)^{B}\sum_{j=1}^{R}\left(\frac{k-1}{k}\right)^{R-j}
= \sum_{k=1}^{a} k^{R-1}(a-k)^{B}\,\frac{1-\left(\frac{k-1}{k}\right)^{R}}{1-\frac{k-1}{k}}
= \sum_{k=1}^{a}(a-k)^{B}\left(k^{R}-(k-1)^{R}\right),
\]

we obtain the desired expression. □
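
For small decks the closed formula can be compared with a brute-force count. The Python sketch below uses the inverse-shuffle description (label every card of the final arrangement with one of a digits, then sort stably by label) and counts labelings that restore reds atop blacks. The string encoding of decks and both function names are assumptions of this sketch. The closed formula is taken in the form $a^{-(R+B)}\sum_{k}\sum_{j}(k-1)^{R-j}k^{j-1}(a-k)^{b(j)}(a-k+1)^{B-b(j)}$.

```python
from fractions import Fraction
from itertools import permutations, product

def q_formula(w, a):
    """Closed formula for the chance of arrangement w (string of 'R'/'B', top first)."""
    R, B = w.count("R"), w.count("B")
    total, blacks_above, j = 0, 0, 0
    for card in w:
        if card == "B":
            blacks_above += 1
            continue
        j += 1                       # this is the j-th red card
        b = blacks_above             # black cards above it
        total += sum((k - 1) ** (R - j) * k ** (j - 1)
                     * (a - k) ** b * (a - k + 1) ** (B - b)
                     for k in range(1, a + 1))
    return Fraction(total, a ** (R + B))

def q_bruteforce(w, a):
    """Count digit labelings of w whose stable sort gives reds atop blacks."""
    n = len(w)
    target = "R" * w.count("R") + "B" * w.count("B")
    hits = 0
    for labels in product(range(a), repeat=n):
        order = sorted(range(n), key=lambda p: labels[p])   # Python's sort is stable
        hits += "".join(w[p] for p in order) == target
    return Fraction(hits, a ** n)

decks = sorted(set("".join(p) for p in permutations("RRBB")))
for w in decks:
    print(w, q_formula(w, 3), q_bruteforce(w, 3))
```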

Given (13), $Q_a$ gives a completely explicit description of a-shuffles, though this is difficult
to evaluate for an arbitrary w. However, there are two special deck configurations for which
$Q_a$ simplifies nicely, namely reds atop blacks (where $r_{x,y}(w) = \binom{R}{x}\binom{B}{y}$) and blacks atop
reds (where $r_{x,y}(w) = 0$). By Proposition 4.1, the formulae below can be used to give exact
calculations for separation and $l_\infty$.

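Corollary 3.2. The probability of an a-shuffle resulting in the original deck configuration
of reds atop blacks is

\[ \frac{1}{a^{R+B}}\sum_{k=1}^{a}(a-k+1)^{B}\left(k^{R}-(k-1)^{R}\right). \]

The probability of an a-shuffle resulting in the reverse deck configuration of blacks atop reds is

\[ \frac{1}{a^{R+B}}\sum_{k=1}^{a-1}(a-k)^{B}\left(k^{R}-(k-1)^{R}\right). \]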
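
Since separation is attained at the reversed deck, the reverse-deck probability gives the exact separation distance for a red-black deck. A minimal Python sketch, with hypothetical function names and the number of shuffles entering through $a = 2^k$:

```python
from fractions import Fraction
from math import comb

def reverse_deck_probability(R, B, a):
    """Chance that an a-shuffle of reds atop blacks produces blacks atop reds."""
    total = sum((a - k) ** B * (k ** R - (k - 1) ** R) for k in range(1, a))
    return Fraction(total, a ** (R + B))

def redblack_separation(R, B, shuffles):
    """Exact separation: 1 minus (least likely probability) / (uniform probability)."""
    a = 2 ** shuffles
    return float(1 - comb(R + B, R) * reverse_deck_probability(R, B, a))

for shuffles in (1, 2, 5, 10):
    print(shuffles, round(redblack_separation(26, 26, shuffles), 4))
```

For 26 red and 26 black cards the first two values can be compared with the exact figures .8898... and .8897... quoted in the introduction.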



Another special case to consider is tracking the position of a single card starting at the
bottom of the deck. For this case, taking B = 1 and R = n - 1 in (11) we recover Corollary
2.4.

Note that if instead we consider a single red card, i.e. R = 1 and B = n - 1, starting at
the top, then the distribution is the same. More precisely, let $\widetilde{Q}_a(i)$ denote the chance that,
say, the 2 of hearts is at position i from the top of the deck after an a-shuffle. Then it is
easy to verify that $\widetilde{Q}_a(i) = Q_a(n - i + 1)$, which is just a special case of the cross-symmetry
in Proposition 2.2.

Finally, consider the case of a single 2-shuffle for an arbitrary red-black deck. In this
case, the contribution from cuts with no mixed pile (the first case in the proof of Theorem 3.1)
reduces to a single term evaluating to 1. For the contribution from cuts with a mixed pile,
note that k = 1 forces x = R, and k = a forces y = B.

Corollary 3.3. The probability of a 2-shuffle resulting in a deck configuration w is

\[ Q_2(w) = \frac{1}{2^{R+B}}\left(2^{h(w)} + 2^{t(w)} - 1\right), \tag{14} \]

where $h(w)$ denotes the number of red cards preceding the first black card in w, and $t(w)$
denotes the number of black cards following the final red card of w.

Equation (14) can be used to give a simple formula for the total variation after a single
2-shuffle of a deck with n red cards and n black cards. Here note that any two configurations
with the same number of red cards on top and black cards on bottom have the same likelihood
of occurrence. Therefore the total variation distance after a single 2-shuffle is given by

\[ \frac{1}{2}\left|\frac{2^{n+1}-1}{2^{2n}} - \frac{1}{\binom{2n}{n}}\right| + \frac{1}{2}\sum_{i=0}^{n-1}\sum_{j=0}^{n-1}\binom{2n-(i+j+2)}{n-(i+1)}\left|\frac{2^{i}+2^{j}-1}{2^{2n}} - \frac{1}{\binom{2n}{n}}\right|. \tag{15} \]

Using this formula, the total variation after a single 2-shuffle of a deck with 26 red and 
26 black cards is 0.579, which agrees with the numerical approximations of Conger and 
Viswanath in [10] . Conger and Viswanath have used their Monte Carlo approximation to 
get useful total variation numbers. Their results show that total variation convergence takes 
place much faster than separation convergence in the red-black case. For 52 cards, after 
1,2,3,4 shuffles it is .579, .360, .208, .105, respectively, decreasing by a factor of two from 
then on. 
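The total variation computation just described is easy to carry out exactly. The sketch below groups configurations by the pair $(i, j)$ = (red cards on top, black cards on bottom), uses the count $\binom{2n-(i+j+2)}{n-(i+1)}$ of configurations in each group, and treats the starting deck ($i = j = n$) separately. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def tv_after_one_2_shuffle(n):
    """Total variation to uniform after one 2-shuffle of n reds atop n blacks."""
    u = Fraction(1, comb(2 * n, n))
    # the starting deck itself: h = t = n
    acc = abs(Fraction(2 ** (n + 1) - 1, 4 ** n) - u)
    for i in range(n):
        for j in range(n):
            count = comb(2 * n - (i + j + 2), n - (i + 1))
            q = Fraction(2 ** i + 2 ** j - 1, 4 ** n)
            acc += count * abs(q - u)
    return acc / 2


print(tv_after_one_2_shuffle(1), tv_after_one_2_shuffle(2))
print(round(float(tv_after_one_2_shuffle(26)), 3))
```

Exact rational arithmetic is used so that the tiny probabilities involved for $n = 26$ are not lost to rounding.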

Asymptotic results for the separation distance for red-black configurations appear in the 
following section. 



4. Approach to uniformity in separation for general decks 

In this section we work with general decks containing $D_i$ cards labelled $i$, $1 \le i \le m$.
The following proposition shows that the separation distance is always achieved by reversing the
initial deck configuration. Note this is equivalent to Theorem 2.1 from [10].

Proposition 4.1. Let $D$ be a deck as above. After an $a$-shuffle of the deck with 1's on top
down to $m$'s on bottom, the most likely deck configuration is this initial deck and the least
likely configuration is the reverse deck $w^*$ with $m$'s on top down to 1's on the bottom. In
particular, the separation distance is achieved for $w^*$.

Proof. Note first that the initial configuration can result from any possible cut of the deck
into $a$ piles. Moreover, from any given cut of the deck, the identity is at least as likely to
occur as any other configuration. The first assertion now follows. The only cuts of the initial
deck which may result in $w^*$ are those containing no pile with distinct letters. However,
for all such cuts, each rearrangement of the deck is equally likely to occur. Therefore $w^*$
minimizes $Q_a(w)$ and so maximizes $1 - Q_a(w)/U$. □

The explicit formula for $Q_a(w^*)$ given in Corollary 3.2 facilitates exact computations of
SEP(a) for decks of practical interest. Similarly, we can compute $Q_a(w^*)$ for an arbitrary
deck with $D_i$ $i$'s, $i = 1, \ldots, m$.

Theorem 4.2. Consider a deck with $n$ cards and $D_i$ cards labeled $i$, $i = 1, \ldots, m$. Then the
separation distance after an $a$-shuffle of the sorted deck (1's followed by 2's, etc.) is given by

$$\mathrm{SEP}(a) = 1 - \frac{1}{a^n} \binom{n}{D_1, \ldots, D_m} \sum_{0 = k_0 < \cdots < k_{m-1} < a} (a - k_{m-1})^{D_m} \prod_{j=1}^{m-1} \left( (k_j - k_{j-1})^{D_j} - (k_j - k_{j-1} - 1)^{D_j} \right).$$



Proof. From Proposition 4.1, $w^*$ may only result from cuts with no pile containing distinct
cards and any such cut is equally likely to result in any deck. Therefore $Q_a(w^*)$ is given by

$$\sum_{\substack{A_1 + \cdots + A_a = n \\ A \text{ refines } D}} \frac{1}{a^n} \binom{n}{A_1, \ldots, A_a} \frac{1}{\binom{n}{D_1, \ldots, D_m}},$$

where '$A$ refines $D$' means there exist indices $k_1, \ldots, k_{m-1}$ such that $A_1 + \cdots + A_{k_1} = D_1$
and, for $i = 2, \ldots, m-1$, $A_{k_{i-1}+1} + \cdots + A_{k_i} = D_i$. Just as in the proof of Theorem 3.1, we
may take the $k_i$'s to be minimal so that the expression for $Q_a(w^*)$ simplifies to

$$\frac{1}{a^n} \sum_{0 = k_0 < \cdots < k_{m-1} < a} (a - k_{m-1})^{D_m} \prod_{i=1}^{m-1} \left( (k_i - k_{i-1})^{D_i} - (k_i - k_{i-1} - 1)^{D_i} \right). \tag{16}$$

The result now follows from Proposition 4.1. □
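The sum in Theorem 4.2 runs over increasing sequences $0 = k_0 < \cdots < k_{m-1} < a$, which can be accumulated one card type at a time. The sketch below does this with a small dynamic program in exact arithmetic; the function name and the list-based representation of the deck (`D = [D_1, ..., D_m]`) are assumptions of this example.

```python
from fractions import Fraction
from math import factorial


def sep_exact(D, a):
    """Separation distance after an a-shuffle of the sorted deck with D[i] cards of type i+1."""
    n = sum(D)
    multinomial = factorial(n)
    for d in D:
        multinomial //= factorial(d)
    # ways[k] = sum over 0 = k_0 < ... < k_j = k of the partial product in Theorem 4.2
    ways = [0] * a
    ways[0] = 1
    for Dj in D[:-1]:
        new = [0] * a
        for k in range(1, a):
            new[k] = sum(ways[p] * ((k - p) ** Dj - (k - p - 1) ** Dj)
                         for p in range(k) if ways[p])
        ways = new
    total = sum(ways[k] * (a - k) ** D[-1] for k in range(a))
    return 1 - Fraction(multinomial * total, a ** n)


# red-black deck with 26 + 26 cards after 1, 2, 3 riffle shuffles (a = 2, 4, 8)
for a in (2, 4, 8):
    print(a, round(float(sep_exact([26, 26], a)), 3))
```

The running time grows like $m a^2$ big-integer operations, so this direct evaluation becomes slow for many shuffles (large $a$); that limitation is one motivation for an asymptotic formula.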
Table 5. Separation distance for k shuffles of 52 cards.

k            1      2      3      4      5      6      7      8      9      10     11     12
BD-92        1.00   1.00   1.00   1.00   1.00   1.00   1.00   .995   .928   .729   .478   .278
blackjack    1.00   1.00   1.00   1.00   .999   .970   .834*  .596*  .366*  .204*  .108*  .056*
♣♦♥♠         1.00   .997   .997   .976   .884   .683   .447   .260   .140   .073   .037*  .019*
A♠           1.00   1.00   .993   .875   .605   .353   .190   .098   .050   .025   .013   .006
redblack     .890   .890   .849   .708   .508   .317   .179   .095   .049   .025   .013   .006
Zener        1.00   1.00   .993   .943   .778   .536   .321   .177   .093*  .048*  .024*  .012*

Remarks on Table 5. We calculate SEP after repeated 2-shuffles for various decks using
Theorem 4.2: (blackjack) 9 ranks, say A23456789, with 4 cards each and another rank, say
10, with 16 cards; (♣♦♥♠) 4 distinct suits, say clubs, diamonds, hearts and spades, of 13
cards each; (A♠) the ace of spades and 51 other cards; (redblack) a two color deck with 26
red and 26 black cards; and (Zener) a deck with 5 cards in each of 5 suits. The entries
in Table 5 indicated by * were provided by the referee using Remark 4.9 below.

Proposition 4.1 may be used with the Conger-Viswanath formula in (6) to give a simple
expression for separation after an $a$-shuffle for a deck of size $h + n$ with cards labelled
$1, 2, \ldots, h$ and $n$ cards labelled $x$:

$$\mathrm{SEP}(a) = 1 - \frac{(n+h) \cdots (n+1)}{a^{n+h}} \sum_{k=h-1}^{a-1} \binom{k}{h-1} (a-1-k)^n.$$

Now we derive a basic asymptotic tool, Proposition 4.3, which allows asymptotic approximations
for general decks. As motivation, consider again the case of one card mixing, i.e.
begin with $n$ cards with the ace of spades at the bottom of the initial deck. How many
shuffles are required to randomize the ace of spades? Recall from Corollary 2.4 that the
chance that the ace of spades is at position $i$ from the top after an $a$-shuffle is given by

$$Q_a(i) = \frac{1}{a^n} \sum_{k=1}^{a} (k-1)^{n-i} k^{i-1},$$

with the convention $0^0 = 1$. Therefore from Proposition 4.1 we have

$$\mathrm{SEP}(a) = 1 - n Q_a(1) = 1 - \frac{n}{a^n} \sum_{k=1}^{a} (k-1)^{n-1}. \tag{17}$$

Exact calculations when $n = 52$ are given in Table 5.
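The single-card case is simple enough to evaluate directly. The sketch below assumes the position formula $Q_a(i) = a^{-n} \sum_{k=1}^{a} (k-1)^{n-i} k^{i-1}$ for the card that starts at the bottom (with $0^0 = 1$, which is also Python's convention for integer powers) and evaluates (17) for the ace-of-spades deck with $n = 52$. The function names are hypothetical.

```python
from fractions import Fraction


def q_bottom_card(n, a, i):
    """Chance that the original bottom card sits at position i (from the top) after an a-shuffle."""
    s = sum((k - 1) ** (n - i) * k ** (i - 1) for k in range(1, a + 1))
    return Fraction(s, a ** n)


def sep_one_card(n, a):
    """Equation (17): separation distance for tracking a single card."""
    return 1 - n * q_bottom_card(n, a, 1)


print(sum(q_bottom_card(5, 3, i) for i in range(1, 6)))  # positions exhaust all cases
for shuffles in (3, 4, 12):
    print(shuffles, round(float(sep_one_card(52, 2 ** shuffles)), 3))
```

For large $a$ the sum is close to $a^n/n - a^{n-1}/2$, so the separation distance behaves like $n/(2a)$; this is the kind of statement that Proposition 4.3 makes precise.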

Proposition 4.3. Let $a$ be a positive real number, and let $r$ and $s$ be natural numbers with
$r, s \ge 2$. Let $\xi$ be a real number in $[0, 1]$. Then

$$S(a, \xi; r, s) := \frac{1}{a^{r+s+1}} \sum_{0 \le k \le a - \xi} (k + \xi)^r (a - k - \xi)^s = \frac{r!\, s!}{(r+s+1)!} + \frac{\theta}{6a^2} \frac{r!\, s!}{(r+s-1)!} \left( \frac{1}{r-1} + \frac{1}{s-1} \right),$$

where $\theta$ is a real number in $[-1, 1]$.

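Proof. Put $f(x) = x^r (1-x)^s$ for $x \in [0, 1]$ and $f(x) = 0$ otherwise. The sum that we wish
to evaluate is

$$\frac{1}{a} \sum_{k \in \mathbb{Z}} f\left( \frac{k + \xi}{a} \right) = \sum_{\ell \in \mathbb{Z}} \hat{f}(a\ell)\, e(\ell \xi) \tag{18}$$

by the Poisson summation formula. Here, we write $e(x) = e^{2\pi i x}$ and $\hat{f}(y) = \int_{-\infty}^{\infty} f(x) e(-xy)\, dx$
denotes the Fourier transform.
Now note that

$$\hat{f}(0) = \int_0^1 x^r (1-x)^s\, dx = \frac{r!\, s!}{(r+s+1)!}. \tag{19}$$

Further, for $y \ne 0$,

$$\hat{f}(y) = \int_0^1 x^r (1-x)^s e^{-2\pi i x y}\, dx = \frac{1}{2\pi i y} \int_0^1 f'(x) e^{-2\pi i x y}\, dx = -\frac{1}{4\pi^2 y^2} \int_0^1 f''(x) e^{-2\pi i x y}\, dx,$$

upon integrating by parts twice, and since $r, s \ge 2$ we have $f(0) = f'(0) = f(1) = f'(1) = 0$.
Therefore

$$|\hat{f}(y)| \le \frac{1}{4\pi^2 y^2} \int_0^1 |f''(x)|\, dx.$$

Now

and so

$$\int_0^1 |f''(x)|\, dx \le \frac{r!\, s!}{(r+s-1)!} \left( \frac{2}{r-1} + \frac{2}{s-1} \right).$$

Combining the above estimates with (18) and (19) we conclude that our sum equals

$$\frac{r!\, s!}{(r+s+1)!} + \frac{\theta}{\pi^2 a^2} \frac{r!\, s!}{(r+s-1)!} \left( \frac{1}{r-1} + \frac{1}{s-1} \right) \sum_{\ell=1}^{\infty} \frac{1}{\ell^2}$$

for some $\theta \in [-1, 1]$. Since $\sum_{\ell=1}^{\infty} \ell^{-2} = \pi^2/6$ the Proposition follows. □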
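Proposition 4.3 can be probed numerically by computing the normalized sum exactly and comparing its distance from the main term $r!\,s!/(r+s+1)!$ with the stated error bound. The sketch below assumes the error term has the form $\frac{1}{6a^2} \frac{r!\,s!}{(r+s-1)!} \left( \frac{1}{r-1} + \frac{1}{s-1} \right)$ and restricts to integer $a$ and rational $\xi$ so that everything stays exact; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def S(a, xi, r, s):
    """Normalized sum of (k + xi)^r (a - k - xi)^s over 0 <= k <= a - xi."""
    total, k = Fraction(0), 0
    while k + xi <= a:
        total += (k + xi) ** r * (a - k - xi) ** s
        k += 1
    return total / Fraction(a) ** (r + s + 1)


def main_term(r, s):
    return Fraction(factorial(r) * factorial(s), factorial(r + s + 1))


def error_bound(a, r, s):
    c = Fraction(factorial(r) * factorial(s), factorial(r + s - 1))
    return c * (Fraction(1, r - 1) + Fraction(1, s - 1)) / (6 * a ** 2)


cases = [(5, Fraction(1, 3), 2, 2), (10, Fraction(1, 2), 3, 4), (16, Fraction(0), 25, 26)]
for a, xi, r, s in cases:
    err = abs(S(a, xi, r, s) - main_term(r, s))
    print(a, r, s, float(err), float(error_bound(a, r, s)))
```

In these examples the observed error is far below the bound, which is consistent with the later remark that the approximation is more accurate than the simple error estimates suggest.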

Now suppose we have $n$ red cards and $n$ black cards, so $2n$ cards altogether, with the red
cards starting on top. In this case, the uniform distribution is $U(w) = U = 1/\binom{2n}{n}$. Again
we use Proposition 4.1, this time with Corollary 3.2, to give a formula for the separation
distance,

$$\mathrm{SEP}(a) = 1 - \frac{Q_a(w^*)}{U} = 1 - \frac{\binom{2n}{n}}{a^{2n}} \sum_{k=1}^{a-1} (a-k)^n \left( k^n - (k-1)^n \right). \tag{20}$$

For exact computations when $2n = 52$, see Table 5. We now use Proposition 4.3 to calculate
asymptotic expressions for this separation distance.

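Corollary 4.4. For $2n$ cards starting with $n$ red cards on top, we have, with $\alpha = 1 - 1/a$,

$$\mathrm{SEP}(a) = 1 - \frac{a}{2n+1} \left( 1 - \alpha^{2n+1} \right) + \frac{4n\,\theta}{6a(n-2)}$$

for some real number $\theta \in [-1, 1]$. In particular, for $n$ large with $a = 2^{\log_2(2n) + c}$,

$$\mathrm{SEP}(a) = 1 - 2^c \left( 1 - e^{-2^{-c}} \right) + o(1).$$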

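Proof. Note that

$$\sum_{k=1}^{a-1} (a-k)^n \left( k^n - (k-1)^n \right) = n \sum_{k=1}^{a-1} (a-k)^n \int_0^1 (k-1+t)^{n-1}\, dt = n \int_0^1 \sum_{k=1}^{a-1} \big( (a-1+t) - (k-1+t) \big)^n (k-1+t)^{n-1}\, dt.$$

Using Proposition 4.3 (with $a - 1 + t$ in place of $a$, $\xi = t$, $r = n-1$ and $s = n$) we see that the inner sum over $k$ above equals

$$(a-1+t)^{2n} \left( \frac{(n-1)!\, n!}{(2n)!} + \frac{\theta}{6(a-1+t)^2} \frac{(n-1)!\, n!}{(2n-2)!} \left( \frac{1}{n-2} + \frac{1}{n-1} \right) \right)$$

for some $\theta \in [-1, 1]$. Using these observations in (20) we obtain that SEP(a) is given by

$$1 - \int_0^1 \left( \frac{a-1+t}{a} \right)^{2n} dt + \frac{\theta}{6a^2} \frac{2n(2n-1)(2n-3)}{(n-1)(n-2)} \int_0^1 \left( \frac{a-1+t}{a} \right)^{2n-2} dt.$$

With a little calculus the Corollary follows. □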
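To see how close the main term of Corollary 4.4 is to the truth for a 52-card red-black deck, one can evaluate the exact formula (20) next to $1 - \frac{a}{2n+1}(1 - \alpha^{2n+1})$. The sketch below does so in exact rational arithmetic; the function names are hypothetical, and the comparison ignores the $\theta$ error term.

```python
from fractions import Fraction
from math import comb


def sep_exact_red_black(n, a):
    """Equation (20): exact separation for n reds atop n blacks after an a-shuffle."""
    s = sum((a - k) ** n * (k ** n - (k - 1) ** n) for k in range(1, a))
    return 1 - Fraction(comb(2 * n, n) * s, a ** (2 * n))


def sep_approx_red_black(n, a):
    """Main term of Corollary 4.4, with alpha = 1 - 1/a."""
    alpha = 1 - Fraction(1, a)
    return 1 - Fraction(a, 2 * n + 1) * (1 - alpha ** (2 * n + 1))


for shuffles in range(1, 9):
    a = 2 ** shuffles
    exact = sep_exact_red_black(26, a)
    approx = sep_approx_red_black(26, a)
    print(shuffles, round(float(exact), 3), round(float(approx), 3), float(abs(exact - approx)))
```

The first two rows show that the approximation is poor for one or two shuffles, while from about the third shuffle on the two columns agree to three decimals.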

The approximation

$$\binom{2n}{n} \sum_{k=1}^{a-1} (a-k)^n \left( k^n - (k-1)^n \right) \approx \frac{a^{2n+1} - (a-1)^{2n+1}}{2n+1} \tag{21}$$

which is the basis of our Corollary above is more accurate than suggested by the simple 
error bounds that we have given. For example, when n = 26 and a = 16, the actual 
separation distance (given in Table [5]) differs from the approximation of the Corollary by 
about 7x 10~^^. Put differently, note that the LHS and the RHS of (j2ip are both polynomials 
in a of degree 2n, and in fact the coefficients of both polynomials match for all degrees 
between n and 2n. 

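Before moving to general decks, we establish a generalization of Proposition 4.3.

Proposition 4.5. Let $m \ge 2$ and $a$ be natural numbers, let $\xi_1, \ldots, \xi_m$ be real numbers in
$[0, 1]$. Let $r_1, \ldots, r_m$ be natural numbers all at least $r \ge 2$. Let

$$S_m(a; \underline{\xi}, \underline{r}) = \sum_{\substack{a_1, \ldots, a_m \ge 0 \\ a_1 + \cdots + a_m = a}} (a_1 + \xi_1)^{r_1} \cdots (a_m + \xi_m)^{r_m}.$$

Then

$$\left| S_m(a; \underline{\xi}, \underline{r}) - \frac{r_1! \cdots r_m!}{(r_1 + \cdots + r_m + m - 1)!} (a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 1} \right| \le r_1! \cdots r_m! \sum_{j=1}^{m-1} \binom{m-1}{j} \left( \frac{1}{3(r-1)} \right)^j \frac{(a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 1 - 2j}}{(r_1 + \cdots + r_m + m - 1 - 2j)!}.$$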



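Proof. We establish this by induction on $m$. The case $m = 2$ follows from Proposition 4.3,
taking there $a$ to be what we would now call $a + \xi_1 + \xi_2$. Let now $m \ge 3$ and suppose the
result has been established for $m - 1$ variables. Now

$$S_m(a; \underline{\xi}, \underline{r}) = \sum_{a_1 = 0}^{\infty} (a_1 + \xi_1)^{r_1} S_{m-1}(a - a_1; \tilde{\xi}, \tilde{r}) \tag{22}$$

with $\tilde{\xi} = (\xi_2, \ldots, \xi_m)$ and $\tilde{r} = (r_2, \ldots, r_m)$, and interpreting the terms with $a_1 > a$ as being
0. Using the induction hypothesis we have that

$$\left| S_{m-1}(a - a_1; \tilde{\xi}, \tilde{r}) - \frac{r_2! \cdots r_m!}{(r_2 + \cdots + r_m + m - 2)!} (a - a_1 + \xi_2 + \cdots + \xi_m)^{r_2 + \cdots + r_m + m - 2} \right| \le r_2! \cdots r_m! \sum_{j=1}^{m-2} \binom{m-2}{j} \left( \frac{1}{3(r-1)} \right)^j \frac{(a - a_1 + \xi_2 + \cdots + \xi_m)^{r_2 + \cdots + r_m + m - 2 - 2j}}{(r_2 + \cdots + r_m + m - 2 - 2j)!}.$$

Note that the above estimate is valid even if $a + \xi_2 + \cdots + \xi_m - 1 \ge a_1 > a$ since the RHS
is larger than the main term that is being subtracted in the LHS. We use this estimate in
(22), and then invoke Proposition 4.3 to handle each of the $m - 1$ new sums that arise.
Thus, the contribution of the main term above is, for some $|\theta| \le 1$,

$$\frac{r_1! \cdots r_m!}{(r_1 + \cdots + r_m + m - 1)!} (a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 1} + \frac{\theta}{3(r-1)}\, r_1! \cdots r_m!\, \frac{(a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 3}}{(r_1 + \cdots + r_m + m - 3)!},$$

while the $j$-th term on the RHS contributes

$$r_1! \cdots r_m! \binom{m-2}{j} \left( \frac{1}{3(r-1)} \right)^j \left( \frac{(a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 1 - 2j}}{(r_1 + \cdots + r_m + m - 1 - 2j)!} + \frac{1}{3(r-1)} \frac{(a + \xi_1 + \cdots + \xi_m)^{r_1 + \cdots + r_m + m - 1 - 2j - 2}}{(r_1 + \cdots + r_m + m - 1 - 2j - 2)!} \right).$$

Using these in (22) and the above estimate, and using the triangle inequality, and that
$\binom{m-1}{j} = \binom{m-2}{j} + \binom{m-2}{j-1}$, we obtain the Proposition. □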

Consider now a general deck of $n$ cards with $D_1$ 1's followed by $D_2$ 2's and so on ending
with $D_m$ $m$'s. Recall that the separation is maximum for the reverse configuration of the
deck, and that probability is given in Theorem 4.2. We now use Proposition 4.5 to find
asymptotics for that separation distance. The following is our 'rule of thumb.'

Theorem 4.6. Consider a deck of $n$ cards of $m$-types as above. Suppose that $D_i \ge d \ge 3$
for all $1 \le i \le m$. Then the separation distance is

$$\mathrm{SEP}(a) = 1 - (1 + \eta) \frac{a^{m-1}}{(n+1) \cdots (n+m-1)} \sum_{j=0}^{m-1} (-1)^j \binom{m-1}{j} \left( 1 - \frac{j}{a} \right)^{n+m-1},$$

where $\eta$ is a real number satisfying

$$|\eta| \le \left( 1 + \frac{n^2}{3(d-2)(a-m+1)^2} \right)^{m-1} - 1.$$

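Proof. Recall the expression for the separation distance given in Theorem 4.2. To evaluate
this, we require an understanding of

$$\sum_{\substack{a_1 + \cdots + a_m = a \\ a_j \ge 1}} a_m^{D_m} \prod_{j=1}^{m-1} \left( a_j^{D_j} - (a_j - 1)^{D_j} \right) = \int_0^1 \cdots \int_0^1 \sum_{\substack{a_1 + \cdots + a_m = a \\ a_j \ge 1}} a_m^{D_m} \prod_{j=1}^{m-1} \left( D_j (a_j - 1 + \xi_j)^{D_j - 1}\, d\xi_j \right).$$

We now invoke Proposition 4.5. Thus the above equals, for some $|\theta| \le 1$,

$$\left( \prod_{i=1}^{m} D_i! \right) \int_0^1 \cdots \int_0^1 \left( \frac{(a - (m-1) + \xi_1 + \cdots + \xi_{m-1})^n}{n!} + \theta \sum_{j=1}^{m-1} \binom{m-1}{j} \left( \frac{1}{3(d-2)} \right)^j \frac{(a - (m-1) + \xi_1 + \cdots + \xi_{m-1})^{n-2j}}{(n-2j)!} \right) d\xi_1 \cdots d\xi_{m-1}.$$

We may simplify the above as

$$\left( 1 + \theta \left( \left( 1 + \frac{n^2}{3(d-2)(a-m+1)^2} \right)^{m-1} - 1 \right) \right) \frac{D_1! \cdots D_m!}{n!} \int_0^1 \cdots \int_0^1 (a - m + 1 + \xi_1 + \cdots + \xi_{m-1})^n\, d\xi_1 \cdots d\xi_{m-1},$$

and evaluating the integrals above this is

$$\left( 1 + \theta \left( \left( 1 + \frac{n^2}{3(d-2)(a-m+1)^2} \right)^{m-1} - 1 \right) \right) \frac{D_1! \cdots D_m!}{n!} \frac{1}{(n+1) \cdots (n+m-1)} \sum_{j=0}^{m-1} (-1)^j \binom{m-1}{j} (a-j)^{n+m-1}.$$

The Theorem follows. □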

Remark 4.7. For simplicity we have restricted ourselves to the case when each pile has at
least three cards. With more effort we could extend the analysis to include doubleton piles.
The case of some singleton piles needs some modifications to our formula, but this variant
can also be worked out.

Remark 4.8. From Theorem 4.6 one can show that for a general deck as above, one needs
$a$ of size about $nm$ before the separation distance becomes small. We note that when $a$ is
of size about $nm$, the quantity $\eta$ appearing in Theorem 4.6 is of size about $1/(m(d-2))$, so
that the estimates furnished above represent a true asymptotic unless both $m$ and $d$ happen
to be small. In other words, when we either have many piles, or a small number of thick
piles, Theorem 4.6 gives a good asymptotic.

Remark 4.9. While asymptotic, Theorem 4.6 is astonishingly accurate for decks of practical
interest. For example, comparing exact calculations in Table 5 with approximations using
this rule of thumb in Table 1 shows that after only 3 shuffles, the numbers agree to the
given precision. Moreover, the simplicity of the formula in Theorem 4.6 allows much further
computations than are possible using the formula in Theorem 4.2.
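The rule of thumb depends on the deck only through $n$ and $m$, so it takes a few lines to evaluate. The sketch below assumes the main term has the form $1 - \frac{a^{m-1}}{(n+1)\cdots(n+m-1)} \sum_{j=0}^{m-1} (-1)^j \binom{m-1}{j} (1 - j/a)^{n+m-1}$ (that is, $\eta$ is set to 0) and uses exact rationals because the alternating sum cancels heavily. The function name is hypothetical, and the formula is only meaningful once $a \ge m$.

```python
from fractions import Fraction
from math import comb


def sep_rule_of_thumb(n, m, a):
    """Main term of the rule of thumb for n cards of m types after an a-shuffle (a >= m)."""
    denom = 1
    for i in range(1, m):
        denom *= n + i
    alt = sum((-1) ** j * comb(m - 1, j) * (1 - Fraction(j, a)) ** (n + m - 1)
              for j in range(m))
    return 1 - Fraction(a ** (m - 1), denom) * alt


a = 2 ** 12  # twelve riffle shuffles
for name, n, m in [("redblack", 52, 2), ("four suits", 52, 4), ("Zener", 25, 5)]:
    print(name, round(float(sep_rule_of_thumb(n, m, a)), 3))
```

Only $m$ terms are summed regardless of $a$, in contrast to the sum over increasing sequences $k_1 < \cdots < k_{m-1}$ in Theorem 4.2.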

We now give a heuristic for why our rule of thumb is numerically so accurate; this was
hinted at previously in our remark following Corollary 4.4. Let $k \ge 0$ be an integer, and
define

$$f_k(z) = \sum_{r=0}^{\infty} r^k z^r,$$

with the convention that $0^0 = 1$. Thus $f_0(z) = 1/(1-z)$, $f_1(z) = z/(1-z)^2$, and in general
$f_k(z) = A_k(z)/(1-z)^{k+1}$ where $A_k(z)$ denotes the $k$-th Eulerian polynomial. The sum
over $a_1, \ldots, a_m$ appearing in our proof of Theorem 4.6 is simply the coefficient of $z^a$ in the
generating function $(1-z)^{m-1} f_{D_1}(z) \cdots f_{D_m}(z)$. Our rule of thumb may be interpreted as
saying that

$$(1-z)^{m-1} f_{D_1}(z) \cdots f_{D_m}(z) \approx \frac{D_1! \cdots D_m!}{(n+m-1)!} (1-z)^{m-1} f_{n+m-1}(z). \tag{23}$$

To explain the sense in which (23) holds, note that $f_k(z)$ extends meromorphically to the
complex plane, and it has a pole of order $k + 1$ at $z = 1$. Moreover it is easy to see that
$f_k(z) - k!/(1-z)^{k+1}$ has a pole of order at most $k$ at $z = 1$. Therefore, the LHS and RHS
of (23) have poles of order $n + 1$ at $z = 1$, and their leading order contributions match.
Therefore the difference between the RHS and LHS of (23) has a pole of order at most $n$
at $z = 1$. But in fact, this difference can have a pole of order at most $n - d$ at $z = 1$, and
thus the approximation in (23) is tighter than what may be expected a priori. To obtain
our result on the order of the pole, we record that one can show



5. Comparing 2-shuffles with different starting patterns

Conger and Viswanath note that the initial configuration can affect the speed of convergence
to stationarity. In this section, we investigate this for a deck with $n$ red and $n$ black
cards. Consider first starting with reds on top. If the initial cut is at $n$ (the most likely
value) then the red-black pattern is perfectly mixed after a single shuffle. More generally,
by Corollary 3.3, the chance of the deck $w$ resulting from a single 2-shuffle of a deck with $n$
red cards atop $n$ black cards is given by

$$Q_2(w) = \frac{1}{2^{2n}} \left( 2^{h(w)} + 2^{t(w)} - 1 \right).$$

Consider next the result of 2-shuffles on the alternating deck red-black-red-black-$\cdots$.
As motivation, we recall a popular card trick: Begin with a deck of $2n$ cards arranged
alternately red, black, red, black, etc. The deck may be cut any number of times. Have the
deck turned face up and cut (with cuts completed) until one of the cuts results in the two
piles having cards of opposite color uppermost. At this point, ask one of the participants
to riffle shuffle the two piles together. The resulting arrangement has the top two cards
containing one red and one black, the next two cards containing one red and one black, and
so on throughout the deck. This trick is called the Gilbreath Principle after its inventor,
the mathematician Norman Gilbreath. It is developed, with many variations, in Chapter 4
of [23].

From the trick we see that beginning with an alternating deck severely limits the possibilities.
Which start mixes faster? The following developments both explain the trick and
give a useful formula for analysis.

Lemma 5.1. The number of deck patterns resulting from a cut with an odd number of
cards in both piles followed by a riffle shuffle is $2^n$. Similarly, the number of deck patterns
resulting from a cut with both piles even followed by a riffle shuffle is $2^{n-1}$.

Proof. For the case of an odd cut, the last two cards after the riffle shuffle must be a red
and a black card. No matter what piles these two cards fell from, the next two cards must
also consist of one red and one black card. Continuing on, the possible resulting decks are
exactly those where the $i$th and $(i+1)$st cards have different colors for $i = 1, 3, \ldots, 2n-1$.
The number of such decks is exactly $2^n$, since the order within each of the $n$ pairs is
independent.

For an even cut, we proceed by induction, noting that the cases $n = 1, 2, 3$ are easily
solved by inspection. In this case, the resulting decks necessarily begin with a
red card and end with a black card. The number of decks beginning with two red cards,
and likewise the number ending with two black cards, is determined by the previous case since removing the top
or bottom card from each pile results in piles with an odd number of cards, giving $2^{n-2}$
possibilities. However, we must discount the overcounted case of decks beginning with two
red cards and ending with two black cards, and, by induction since the piles are again both
even, there are $2^{n-3}$ such decks. Finally, the remaining case must be decks beginning and
ending with a red card followed by a black card. In this case, again, the piles remain even
and by induction the number of such decks is $2^{n-3}$. Therefore the total count for cuts with
both piles even is $2^{n-1} - 2^{n-3} + 2^{n-3} = 2^{n-1}$. □

The proof of the lemma shows exactly why the card trick is a success: to have different 
colors on the top of the two piles, the cut must have been odd. Therefore the first two cards 
dropped consist of one red and one black, and the next two cards dropped consist of one 
red and one black, and so on. Also from the lemma, we see that the only deck that can 
result from either an odd cut or an even cut is the identity. 

Proposition 5.2. The chance of a 2-shuffle of the alternating deck resulting in a deck
configuration $w$ is given by

$$2^{2n} \cdot Q_2(w) = \begin{cases} 2^{n-1} + 2^n & \text{if } w = w_0, \\ 2^{n-1} & \text{if } w \in O \setminus w_0, \\ 2^n & \text{if } w \in E \setminus w_0, \\ 0 & \text{otherwise}, \end{cases} \tag{24}$$

where $w_0$ is the initial alternating deck and $O$ (respectively, $E$) is the set of decks that can
result from riffling together the two piles from cutting the alternating deck when both piles
have an odd (respectively, even) number of cards.

Proof. Let $w, u \in O$. Then the total number of ways $w$ can result from any odd cut is equal
to the total number of ways $u$ can result from any odd cut. The same is true replacing $O$
with $E$ and "odd" with "even". From the binomial identity

$$2^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} = \sum_{k \text{ odd}} \binom{2n}{k} + \sum_{k \text{ even}} \binom{2n}{k}$$

we must have both the right-hand sums equal to $2^{2n-1}$. Therefore, by Lemma 5.1, the
total number of ways $w$ can result from an odd cut (assuming it can) is $2^{2n-1}/2^n = 2^{n-1}$,
and, similarly, the total number of ways $w$ can result from an even cut (assuming it can) is
$2^{2n-1}/2^{n-1} = 2^n$. □
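The counts in Lemma 5.1 and Proposition 5.2 can be confirmed by listing every 2-shuffle of a small alternating deck. The sketch below is a brute-force check under the digit-string description of a 2-shuffle (each of the $2^{2n}$ binary strings equally likely, with the number of 0s giving the size of the top pile); the function name is hypothetical. It also reports the total variation distance to uniform computed directly from the enumerated law.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def alternating_two_shuffle(n):
    """Enumerate one 2-shuffle of the deck RBRB...RB with 2n cards."""
    deck = "RB" * n
    counts, odd, even = Counter(), set(), set()
    for digits in product((0, 1), repeat=2 * n):
        top = digits.count(0)          # size of the top pile
        nxt = [0, top]                 # next card to drop from each pile
        out = []
        for d in digits:
            out.append(deck[nxt[d]])
            nxt[d] += 1
        w = "".join(out)
        counts[w] += 1
        (odd if top % 2 else even).add(w)
    return counts, odd, even


for n in (3, 4):
    counts, odd, even = alternating_two_shuffle(n)
    u = Fraction(1, comb(2 * n, n))
    tv = sum(max(Fraction(c, 4 ** n) - u, 0) for c in counts.values())
    print(n, len(odd), len(even), odd & even == {"RB" * n}, tv)
```

Only $2^n + 2^{n-1} - 1$ of the $\binom{2n}{n}$ patterns ever occur, which is the quantitative content of the Gilbreath Principle.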

It follows from (24) that the separation distance for a 2-shuffle is SEP(2) = 1 when $n \ge 3$.
Furthermore, since $\binom{2n}{n} > 2^{n+1}$ for such $n$, we can compute the total variation of a 2-shuffle to be

$$\|Q_2 - U\|_{TV} = 1 - \frac{2^n + 2^{n-1} - 1}{\binom{2n}{n}}, \tag{25}$$

which goes to 1 exponentially fast as $n$ goes to infinity; for $n = 26$ it already exceeds 0.999999. Starting with reds
above blacks, asymptotic analysis of (15) shows that the total variation also tends to 1 after a
single shuffle when $n$ is large, but for $n = 26$ it is only 0.579. Thus, measured in total variation
after a single 2-shuffle, the alternating start is farther from uniform than reds above blacks.

Appendix A. Random walks on groups

In this appendix, we reformulate shuffling in terms of random walks on the symmetric
group $S_n$, so that our investigation of particular properties of a deck becomes the quotient
walk on Young subgroups of $S_n$.

Let $G$ be a finite group with $Q(g) \ge 0$, $\sum_{g \in G} Q(g) = 1$ a probability on $G$. The walk in
(1) may be called the left walk since it consists of repeatedly picking elements independently
with probability $Q$, say $g_1, g_2, g_3, \ldots$, and, starting at the identity $1_G$, multiplying on the
left by $g_i$. This generates a random walk on $G$,

$$1_G,\ g_1,\ g_2 g_1,\ g_3 g_2 g_1,\ \ldots.$$

By inspection, the chance that the walk is at $g$ after $k$ steps is $Q^{*k}(g)$, where $Q^{*0}(g) = \delta_{1_G, g}$.

An algebraic method of focusing on aspects of the walk is to use the quotient walk. Let
$H \le G$ be a subgroup of $G$, and set $X = G/H = \{xH\}$ to be the set of left cosets of $H$ in
$G$. The quotient walk is derived from the walk above by simply reporting to which coset
the current position of the walk belongs. The quotient walk is a Markov chain on $X$ with
transition matrix given by

$$K(x, y) = Q(yHx^{-1}) = \sum_{h \in H} Q(y h x^{-1}). \tag{26}$$

Note that $K$ is well-defined (i.e. independent of the choice of coset representatives) and
that $K$ is doubly stochastic. Thus the uniform distribution on $X$, $U(x) = |H|/|G|$, is a
stationary distribution for $K$. The chain $K$ is reversible if and only if $Q$ is symmetric (i.e.
$Q(g) = Q(g^{-1})$). Note that this is not the case for riffle shuffles. While intuitively obvious,
the following shows the basic fact that powers of the matrix $K$ correspond to convolving
and taking cosets.

Proposition A.1. For $Q$ a probability distribution on a finite group $G$ and $K$ as defined
in (26), we have

$$K^l(x, y) = Q^{*l}(yHx^{-1}).$$

Proof. The result is immediate from the definitions for $l = 0, 1$. We prove the result for
$l = 2$, the general case being similar. Note that

$$K^2(x, y) = \sum_z K(x, z) K(z, y) = \sum_z \sum_{h_1, h_2} Q(z h_1 x^{-1}) Q(y h_2 z^{-1}).$$

Setting $h_2 = h h_1^{-1}$, noting that $g = z h_1$ runs over $G$ as $z$ runs over $X$ and $h_1$ over $H$, and setting
$g_1 = g x^{-1}$, we have

$$K^2(x, y) = \sum_h \sum_g Q(g x^{-1}) Q(y h g^{-1}) = \sum_h \sum_{g_1} Q(g_1) Q(y h x^{-1} g_1^{-1}) = Q^{*2}(yHx^{-1}).$$

□

We may identify permutations in $S_n$ with arrangements of a deck of $n$ cards by setting
$\sigma(i)$ to be the label of the card at position $i$ from the top. Thus the permutation 2 1 4 3 is
associated with four cards where "2" is on top, followed by "1", followed by "4", and finally
"3" is on the bottom. If we consider the cards labelled $1, 2, \ldots, k$ to be "red" cards, and
the cards labelled $k+1, k+2, \ldots, n$ to be "black" cards, with all cards of the same color
indistinguishable, the coset space

$$X = S_n / (S_k \times S_{n-k})$$

is naturally associated with the $\binom{n}{k}$ arrangements of red and black unlabeled cards. Here,
of course, we identify an element of $S_k \times S_{n-k} \le S_n$ as permuting the first $k$ and last $n - k$
cards among themselves. Similar constructions work for suits or values. Thus Proposition
A.1 shows that the processes studied in the body of this paper are Markov chains.
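Proposition A.1 can be tested on a small example. The sketch below builds the quotient kernel $K(x, y) = \sum_{h \in H} Q(y h x^{-1})$ for $G = S_4$ and $H = S_2 \times S_2$ (two "red" and two "black" cards) and checks that $K^2$ agrees with the kernel built from $Q * Q$. The measure `Q` used here is an arbitrary non-symmetric choice made purely for illustration, not a shuffling measure from the text, and the convolution is taken as $(P_1 * P_2)(g) = \sum_t P_1(g t^{-1}) P_2(t)$.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(4)))


def mul(p, q):
    """Composition: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(4))


def inv(p):
    r = [0] * 4
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


# H = S_2 x S_2: permutations mapping {0, 1} to itself (and hence {2, 3} to itself)
H = [h for h in G if set(h[:2]) == {0, 1}]

# an arbitrary probability on G, for illustration only
weights = {g: 1 + idx for idx, g in enumerate(G)}
Q = {g: Fraction(wt, sum(weights.values())) for g, wt in weights.items()}


def convolve(P1, P2):
    return {g: sum(P1[mul(g, inv(t))] * P2[t] for t in G) for g in G}


# one representative for each left coset xH
reps, seen = [], set()
for g in G:
    if g not in seen:
        reps.append(g)
        seen.update(mul(g, h) for h in H)


def quotient_kernel(P):
    return [[sum(P[mul(mul(y, h), inv(x))] for h in H) for y in reps] for x in reps]


K = quotient_kernel(Q)
size = len(reps)
K2 = [[sum(K[i][k] * K[k][j] for k in range(size)) for j in range(size)] for i in range(size)]
print(size, K2 == quotient_kernel(convolve(Q, Q)))
```

The six cosets correspond to the $\binom{4}{2}$ red-black patterns, and the rows and columns of `K` each sum to 1, reflecting that the kernel is doubly stochastic.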

Appendix B. Shuffling by random transpositions 

Let $L^2(X) = \{f : X \to \mathbb{C}\}$ be the set of complex-valued functions on $X$ with inner
product defined by

$$\langle f_1 | f_2 \rangle = \frac{1}{|X|} \sum_{x \in X} f_1(x) \overline{f_2(x)}. \tag{27}$$

If $K$ is symmetric, then real-valued functions may be used. The transition matrix $K$
operates on $L^2$ via

$$Kf(x) = \sum_y K(x, y) f(y). \tag{28}$$

In the present case, $L^2(X) = \mathrm{Ind}_H^G(1)$, the usual permutation representation of $G$ acting on
left cosets $X = G/H$, with $T_g f(x) = f(g^{-1}x)$. By construction, the action of $G$ commutes
with $K$, i.e.

$$T_g(Kf) = K(T_g f) \tag{29}$$

for all $f \in L^2(X)$ and all $g \in G$. This implies that group representation theory can be
used to reduce the operator $K$ (or diagonalize $K$ in the case when $K$ is symmetric). This
classical topic is well developed in Fässler-Stiefel [20] and Boyd et al. [6].

Let $\hat{G}$ denote the set of irreducible representations of the finite group $G$. For $\rho \in \hat{G}$, the
Fourier transform of $f \in L^2(G)$ at $\rho$ is defined by

$$\hat{f}(\rho) = \sum_{g \in G} f(g) \rho(g).$$

As usual, Fourier transform turns convolution into products, i.e.

$$\widehat{f_1 * f_2}(\rho) = \hat{f}_1(\rho) \hat{f}_2(\rho).$$

Schur's lemma implies that the uniform distribution has transform

$$\hat{U}(\rho) = \begin{cases} 1 & \text{if } \rho \text{ is trivial}, \\ 0 & \text{otherwise}. \end{cases}$$

The Fourier inversion theorem reconstructs $f$ from $\{\hat{f}(\rho)\}$ by

$$f(g) = \frac{1}{|G|} \sum_{\rho \in \hat{G}} d_\rho\, \mathrm{Tr}\left( \hat{f}(\rho) \rho(g^{-1}) \right).$$

For background, see Serre [29], Diaconis [13] or Ceccherini et al. [7], where many applications
are given.

Suppose the induced representation $L^2(X)$ decomposes into irreducibles as

$$L^2(X) = \bigoplus_{\rho \in \hat{G}} V_\rho^{\oplus a_\rho}. \tag{30}$$

Then since $K$ commutes with $G$, $K$ sends each of the spaces $V_\rho^{\oplus a_\rho}$ into itself. Further
reductions may be possible if $Q$ has suitable symmetries. The following widely studied
special case is relevant.

Definition B.1. The pair $H \le G$ is a Gelfand pair if $L^2(X)$ is multiplicity free, i.e. all $a_\rho$
in (30) are either 0 or 1.

For example, when $1 \le k \le n/2$, $S_k \times S_{n-k} \le S_n$ is a Gelfand pair with

$$L^2(X) = \bigoplus_{j=0}^{k} S^{(n-j, j)}. \tag{31}$$

Recall that the irreducible representations of $S_n$ are indexed by partitions $\lambda$ of $n$. If $S^\lambda$
denotes the $\lambda$th representation (Specht modules), the sum in (31) runs over partitions into
two parts with the smaller part at most $k$. For further background on Gelfand pairs,
including examples and applications, see [14], [7].

Now we study a deck of red and black cards after repeated random transposition shuffles.
Recall that Diaconis-Shahshahani [19] show that it takes $\frac{1}{2} n (\log(n) + c)$ shuffles to mix $n$
distinct cards. To be precise, the measure on $S_n$ that drives the walks is

$$Q(\sigma) = \begin{cases} 1/n & \text{if } \sigma = \mathrm{id}, \\ 2/n^2 & \text{if } \sigma = (i, j), \\ 0 & \text{otherwise}. \end{cases}$$

Throughout the following, all walks begin at the identity permutation, and we use the
convention that $\pi(i)$ is the label of the card at position $i$.

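First, we follow the position of the top card; i.e. the two of hearts is the only red card
followed by $n - 1$ black cards. The transition matrix for this walk is given by

$$P(i, j) = \begin{cases} 1 - \dfrac{2(n-1)}{n^2} & \text{if } i = j, \\[4pt] \dfrac{2}{n^2} & \text{if } i \ne j. \end{cases} \tag{32}$$

Note that this is symmetric, with $\pi(i) = 1/n$ as the stationary distribution.

Proposition B.2. For the transition matrix $P(i, j)$ above and all $l \ge 0$, we have

$$P^l(i, j) = \begin{cases} \dfrac{1}{n} + \left( 1 - \dfrac{1}{n} \right) \left( 1 - \dfrac{2}{n} \right)^l & \text{if } i = j, \\[4pt] \dfrac{1}{n} - \dfrac{1}{n} \left( 1 - \dfrac{2}{n} \right)^l & \text{if } i \ne j. \end{cases} \tag{33}$$

From this it follows that

$$\mathrm{SEP}(l) = \left( 1 - \frac{2}{n} \right)^l \quad \text{and} \quad \|P^l - U\|_{TV} = \left( 1 - \frac{1}{n} \right) \left( 1 - \frac{2}{n} \right)^l.$$

Proof. The results for the separation and total variation distances follow from (33) and the
definitions. It is possible to give a direct combinatorial argument for (33), but the following
representation theoretic argument generalizes readily to find similar formulae for $j$-tuples of
cards.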

The random transposition measure $Q$ is constant on conjugacy classes of $S_n$ and so acts
on each irreducible representation as a constant times the identity. These constants are
given explicitly by Diaconis-Shahshahani [19], involving characters and dimensions of the
representation. Consider the operator $K(\sigma, \tau) = Q(\tau \sigma^{-1})$ on the regular representation.
The function $f(\sigma) = \delta_{1, \sigma(i)} - 1/n$ lies in the $n - 1$ copies of the $(n-1)$-dimensional
representation corresponding to the partition $(n-1, 1)$. The operator $K$ acts on this space by
multiplication by $1 - 2/n$. Thus

$$\Pr\left\{ \text{card labelled 1 at position } i \text{ after } l \text{ shuffles} \right\} - \frac{1}{n} = K^l f(\sigma) = \left( 1 - \frac{2}{n} \right)^l \left( \delta_{1, \sigma(i)} - \frac{1}{n} \right).$$

Here $\sigma$ is the starting arrangement. Evaluating the right-hand side gives (33).

□
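The closed form in Proposition B.2 is easy to confirm by taking matrix powers of the single-card chain. The sketch below assumes the transition matrix has diagonal entries $1 - 2(n-1)/n^2$ and off-diagonal entries $2/n^2$, and compares exact powers with the closed form together with the stated separation and total variation distances. The helper names are hypothetical.

```python
from fractions import Fraction


def transposition_chain(n):
    """Single-card chain under random transpositions."""
    stay, move = 1 - Fraction(2 * (n - 1), n ** 2), Fraction(2, n ** 2)
    return [[stay if i == j else move for j in range(n)] for i in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def closed_form(n, l, i, j):
    lam = (1 - Fraction(2, n)) ** l
    if i == j:
        return Fraction(1, n) + (1 - Fraction(1, n)) * lam
    return Fraction(1, n) - lam / n


n = 5
P = transposition_chain(n)
Pl = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # P^0
results = []
for l in range(6):
    ok = all(Pl[i][j] == closed_form(n, l, i, j) for i in range(n) for j in range(n))
    sep = max(1 - n * Pl[0][j] for j in range(n))
    tv = sum(abs(Pl[0][j] - Fraction(1, n)) for j in range(n)) / 2
    results.append((l, ok, sep, tv))
    Pl = matmul(Pl, P)
print(results)
```

Both distances decay at the single geometric rate $1 - 2/n$ from the very first step, which is why single card mixing shows no cutoff.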



Next we consider the deck with $N = 2n$ cards where the (original) top $n$ cards are red
and the (original) bottom $n$ cards are black. In this case, we think of the random
transposition operator acting on the quotient space $S_N/(S_n \times S_n)$. For $x, y \in S_N/(S_n \times S_n)$,
the induced Markov chain is

$$K(x, y) = \begin{cases} \dfrac{2}{N^2} & \text{if } x \text{ and } y \text{ differ by a transposition}, \\[4pt] \dfrac{1}{N} + \dfrac{2n(n-1)}{N^2} & \text{if } x = y, \\[4pt] 0 & \text{otherwise}. \end{cases} \tag{34}$$

This chain has uniform stationary distribution $\pi(x) = 1/\binom{2n}{n}$.

The chain $K$ is invariant under $S_N$, i.e. $K(x, y) = K(\sigma x, \sigma y)$, so the distance to stationary
does not depend on the original configuration. As noted earlier, the pair $S_n \times S_n \le S_N$
is a Gelfand pair, so (31) allows an easy determination of the eigenvalues and rate of
convergence.

Proposition B.3. For the Markov chain $K$ on $S_N/(S_n \times S_n)$, the eigenvalues are

$$\beta_j = \frac{1}{N} + \frac{1}{N^2} \left( (N-j)^2 - (N-j) + j^2 - 3j \right) = 1 - \frac{2j(N-j+1)}{N^2},$$

$j = 1, \ldots, n$. The multiplicity of $\beta_j$ is $m_j = \binom{N}{j} - \binom{N}{j-1}$. Moreover, there is a universal constant
$A$ such that if $l = \frac{1}{4} N (\log N + c)$, then

$$\|K^l - U\|_{TV} \le A e^{-c/2}.$$

Proof. The operator $K$ acts on $L^2(S_N/(S_n \times S_n))$ as the element of the group algebra

$$\frac{1}{N}\, \mathrm{id} + \frac{2}{N^2} \sum_{i < j} (i, j).$$

As shown in [17], this element acts on the irreducibles $S^{(N-j, j)}$ as a constant times the identity,
with the constant being $\beta_j$ and the multiplicity being the dimension of $S^{(N-j, j)}$. This proves
the first part.

The remaining claims can be proved following the argument in [17]: bound the total
variation distance by the $L^2$ norm, express this in terms of the eigenvalues and average
over the starting state. This reduces the problem to bounding

$$\sum_{j=1}^{n} m_j \beta_j^{2l}.$$

The lead term in this is

$$(N-1) \left( 1 - \frac{2}{N} \right)^{2l} \le e^{-c}.$$

For $l$ of the form $\frac{1}{4} N (\log N + c)$, the other terms are smaller and sum in a reasonably
standard fashion. The terms are the same as in [17], so we suppress further details. □
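The spectrum in Proposition B.3 can be sanity-checked on small decks without any representation theory, by comparing traces. The sketch below builds the chain (34) on red-black patterns, identified with the $n$-element sets of positions holding red cards, and checks that $\mathrm{tr}(K)$ and $\mathrm{tr}(K^2)$ match $\sum_j m_j \beta_j$ and $\sum_j m_j \beta_j^2$, where $j = 0$ contributes the trivial eigenvalue 1. It assumes $\beta_j = 1 - 2j(N-j+1)/N^2$ and $m_j = \binom{N}{j} - \binom{N}{j-1}$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def red_black_kernel(n):
    """Chain (34): random transpositions acting on patterns of n red and n black cards."""
    N = 2 * n
    states = [frozenset(c) for c in combinations(range(N), n)]  # positions of red cards
    stay = Fraction(1, N) + Fraction(2 * n * (n - 1), N ** 2)
    swap = Fraction(2, N ** 2)
    return [[stay if x == y else (swap if len(x & y) == n - 1 else Fraction(0))
             for y in states] for x in states]


def spectrum(n):
    """Pairs (eigenvalue, multiplicity) for j = 0, ..., n."""
    N = 2 * n
    return [(1 - Fraction(2 * j * (N - j + 1), N ** 2),
             comb(N, j) - (comb(N, j - 1) if j else 0)) for j in range(n + 1)]


for n in (2, 3):
    K = red_black_kernel(n)
    size = len(K)
    trace1 = sum(K[i][i] for i in range(size))
    trace2 = sum(K[i][j] * K[j][i] for i in range(size) for j in range(size))
    spec = spectrum(n)
    print(n, trace1 == sum(m * b for b, m in spec), trace2 == sum(m * b ** 2 for b, m in spec))
```

Matching two traces does not prove the full spectrum, but it is a cheap consistency check on both the eigenvalues and the multiplicities.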

Remark B.4. It is easy to give a lower bound showing that after $l = \frac{1}{4} N (\log N + c)$ steps
the distance to stationary is bounded away from 0 for large $N$. Further, in this case, the
distance tends to 1 if $c = c_N$ tends to $-\infty$.

These results show that for red-black mixing, there is a total variation cutoff at $\frac{1}{4} N \log N$.
Note that single card mixing does not have a cutoff, recalling that in Proposition B.2 the
deck has size $n$ and in Proposition B.3 the deck has size $N = 2n$.